LinuxQuestions.org
Old 12-09-2022, 07:10 PM   #1
joe_sleeping
LQ Newbie
 
Registered: Dec 2022
Posts: 4

Rep: Reputation: 1
After memory-mapping, the process still consumes physical memory when there is a cache


I'm trying to understand mmap. As far as I know, mmap maps virtual addresses directly onto the page cache, so there is no need to copy data from the page cache into the process's virtual memory, and only a single copy of the data exists on the whole machine.

However, when I mmap a file and read from it, memory usage appears to grow by about twice the amount I read. Am I interpreting the numbers incorrectly, or is something wrong with my code?

Memory consumption before testing:
Code:
$ free -m
               total        used        free      shared  buff/cache   available
Mem:            3924        1391        2280          13         251        2292
Swap:              0           0           0
I run the Python code below:
Code:
import mmap
import time

# file2.db is a 2 GB file
with open("/var/tmp/file2.db", "rb") as f:
    with mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ) as mm:
        x = mm.read(500000000)  # read 500 MB from the mapping
        time.sleep(10000)       # keep the process alive so its memory can be inspected
Code:
$ python3 mmap_read.py &
Memory consumption after testing:
Code:
$ free -m
               total        used        free      shared  buff/cache   available
Mem:            3924        1703        1575          13         644        1980
Swap:              0           0           0
I also profiled the process with perf, and it looks like no data is copied:
Code:
$ sudo perf record python3 mmap_read.py & # sample where the process spends CPU time
$ sudo perf report
Result
Code:
Samples: 128  of event 'cpu-clock:pppH', Event count (approx.): 1292929280
Overhead  Command  Shared Object      Symbol
  24.22%  python3  [kernel.kallsyms]  [k] do_user_addr_fault
   4.69%  python3  [kernel.kallsyms]  [k] rmqueue
   3.91%  python3  [kernel.kallsyms]  [k] __add_to_page_cache_locked
   3.91%  python3  [kernel.kallsyms]  [k] charge_memcg
   3.91%  python3  libc.so.6          [.] 0x00000000001a0e81
   3.12%  python3  [kernel.kallsyms]  [k] __lock_text_start
   3.12%  python3  [kernel.kallsyms]  [k] xas_load
   3.12%  python3  libc.so.6          [.] 0x00000000001a0ef0
   2.34%  python3  [kernel.kallsyms]  [k] __mod_lruvec_state
   2.34%  python3  [kernel.kallsyms]  [k] do_anonymous_page
   2.34%  python3  [kernel.kallsyms]  [k] free_unref_page_list
   2.34%  python3  [kernel.kallsyms]  [k] release_pages
   2.34%  python3  libc.so.6          [.] 0x00000000001a0e6f
   1.56%  python3  [kernel.kallsyms]  [k] __cgroup_throttle_swaprate
   1.56%  python3  [kernel.kallsyms]  [k] __mod_node_page_state
   1.56%  python3  [kernel.kallsyms]  [k] filemap_map_pages
   1.56%  python3  [kernel.kallsyms]  [k] pmd_page_vaddr
   1.56%  python3  [kernel.kallsyms]  [k] pmd_pfn
   1.56%  python3  [kernel.kallsyms]  [k] xa_get_order
   1.56%  python3  libc.so.6          [.] 0x00000000001a0e4c
   1.56%  python3  libc.so.6          [.] 0x00000000001a0e7d
   1.56%  python3  libc.so.6          [.] 0x00000000001a0e86
   1.56%  python3  libc.so.6          [.] 0x00000000001a0f47
   0.78%  python3  [kernel.kallsyms]  [k] __bio_add_page
   0.78%  python3  [kernel.kallsyms]  [k] __handle_mm_fault
   0.78%  python3  [kernel.kallsyms]  [k] __mem_cgroup_charge
   0.78%  python3  [kernel.kallsyms]  [k] __page_set_anon_rmap
   0.78%  python3  [kernel.kallsyms]  [k] arch_local_irq_enable
   0.78%  python3  [kernel.kallsyms]  [k] blk_mq_dispatch_rq_list
   0.78%  python3  [kernel.kallsyms]  [k] cgroup_rstat_updated
   0.78%  python3  [kernel.kallsyms]  [k] clear_page_erms
   0.78%  python3  [kernel.kallsyms]  [k] do_set_pte
   0.78%  python3  [kernel.kallsyms]  [k] elv_rqhash_add
   0.78%  python3  [kernel.kallsyms]  [k] finish_task_switch.isra.0
   0.78%  python3  [kernel.kallsyms]  [k] get_mem_cgroup_from_mm
   0.78%  python3  [kernel.kallsyms]  [k] handle_mm_fault
   0.78%  python3  [kernel.kallsyms]  [k] handle_pte_fault
   0.78%  python3  [kernel.kallsyms]  [k] kthread_blkcg
   0.78%  python3  [kernel.kallsyms]  [k] page_counter_try_charge
   0.78%  python3  [kernel.kallsyms]  [k] pmd_val
   0.78%  python3  [kernel.kallsyms]  [k] try_charge_memcg
   0.78%  python3  [kernel.kallsyms]  [k] xas_find
   0.78%  python3  [kernel.kallsyms]  [k] zap_pte_range
   0.78%  python3  libc.so.6          [.] 0x00000000001a0e5a
   0.78%  python3  libc.so.6          [.] 0x00000000001a0e76
   0.78%  python3  libc.so.6          [.] 0x00000000001a0f02
   0.78%  python3  libc.so.6          [.] 0x00000000001a0f07
   0.78%  python3  libc.so.6          [.] 0x00000000001a0f27
   0.78%  python3  libc.so.6          [.] 0x00000000001a0f57
   0.78%  python3  libc.so.6          [.] 0x00000000001a0f5f
   0.78%  python3  python3.10         [.] 0x000000000012161a
   0.78%  python3  python3.10         [.] 0x000000000012d084
I would expect buff/cache to grow and used to stay the same, since the process should reference the data in the page cache. Any idea why that is not what I see? Any help is appreciated.
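
One way to narrow this down is to look at the process's own resident memory, split into anonymous (private) and file-backed pages, immediately before and after the `mm.read()` call. The following is a minimal sketch, assuming a Linux kernel that exposes the `RssAnon` and `RssFile` fields in `/proc/self/status`; the helper names `rss_kib`, `measure_read_copy` and `make_scratch_file` are made up for illustration. Since `mm.read()` returns a `bytes` object, growth in `RssAnon` would point at a private copy rather than at the mapping itself.

```python
import mmap
import os
import tempfile


def rss_kib():
    """Return (RssAnon, RssFile) in KiB from /proc/self/status, or None if unavailable."""
    fields = {}
    try:
        with open("/proc/self/status") as status:
            for line in status:
                key, _, value = line.partition(":")
                if key in ("RssAnon", "RssFile"):
                    fields[key] = int(value.split()[0])
    except OSError:
        return None
    if len(fields) != 2:
        return None
    return fields["RssAnon"], fields["RssFile"]


def measure_read_copy(path, nbytes):
    """mmap `path`, call mm.read(nbytes), and return (bytes_read, rss_before, rss_after)."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ) as mm:
            before = rss_kib()
            data = mm.read(nbytes)  # a new bytes object, held in anonymous memory
            after = rss_kib()       # sampled while `data` is still alive
            return len(data), before, after


def make_scratch_file(nbytes):
    """Create a temporary file of `nbytes` bytes (illustration only) and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as out:
        out.write(b"x" * nbytes)
    return path


if __name__ == "__main__":
    scratch = make_scratch_file(16 * 1024 * 1024)
    try:
        print(measure_read_copy(scratch, 16 * 1024 * 1024))
    finally:
        os.remove(scratch)
```

To reproduce the situation in this post, `measure_read_copy("/var/tmp/file2.db", 500000000)` could be called instead of using the scratch file.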
 
Old 12-10-2022, 09:31 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,039

Rep: Reputation: 7347
I think you have mixed some things up, or at least I don't understand what you mean by page cache. https://www.sobyte.net/post/2022-03/mmap/
 
Old 12-10-2022, 07:28 PM   #3
joe_sleeping
LQ Newbie
 
Registered: Dec 2022
Posts: 4

Original Poster
Rep: Reputation: 1
Hi pan64, thank you for your reply.

The page cache I mean is the cache the Linux kernel keeps after reading file data. For example, when you issue a read syscall, the kernel reads the data from disk, caches it, and copies it into the process's virtual memory.

I agree that I am probably mixing something up, but I have no idea what I am misunderstanding.
 
Old 12-11-2022, 03:15 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,039

Rep: Reputation: 7347
mmap and that cache are two independent things. mmap does not work on (or with) that cache. From this point of view the cache is almost completely invisible (if I understand correctly).
 
Old 12-12-2022, 05:40 AM   #5
joe_sleeping
LQ Newbie
 
Registered: Dec 2022
Posts: 4

Original Poster
Rep: Reputation: 1
I figured out what's wrong with my code. Python allocates a buffer in the process and copies the cached data into that buffer. I wrote a C program instead, and it uses the page cache directly.

Code:
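#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
   /* /var/tmp/large is a large file */
   int fd = open("/var/tmp/large", O_RDONLY);
   if (fd == -1) {
      perror("open");
      return EXIT_FAILURE;
   }

   struct stat statbuf;
   if (fstat(fd, &statbuf) == -1) {
      perror("fstat");
      close(fd);
      return EXIT_FAILURE;
   }
   size_t size = (size_t)statbuf.st_size;

   /* Map the whole file read-only; the pages are backed by the page cache. */
   char *region = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
   if (region == MAP_FAILED) {
      perror("mmap");
      close(fd);
      return EXIT_FAILURE;
   }

   /* The mapping is not NUL-terminated, so write exactly `size` bytes
      rather than relying on printf("%s", ...). */
   fwrite(region, 1, size, stdout);

   munmap(region, size);
   close(fd);
   return EXIT_SUCCESS;
}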
Before running that code:
Code:
$ free -m
               total        used        free      shared  buff/cache   available
Mem:            3924        1221        2204          11         498        2467
Swap:              0           0           0
$ vmtouch /var/tmp/large
           Files: 1
     Directories: 0
  Resident Pages: 30725/488282  120M/1G  6.29%
         Elapsed: 0.008021 seconds
While that code is running, the buff/cache value goes up and the used value is unchanged:
Code:
$ free -m
               total        used        free      shared  buff/cache   available
Mem:            3924        1210         727          11        1985        2477
Swap:              0           0           0
While that code is running, the heap size is unchanged, while the Referenced value of large (i.e. /var/tmp/large) goes up:
Code:
$ pmap -X <<PID>>
   Inode    Size     Rss     Pss Referenced  Anonymous Mapping
       0     132       4       4          4          0  [heap]
 1048884 1953128 1953128 1953128    1951100          0   large
After running that code, the entire file is cached:
Code:
$ vmtouch /var/tmp/large
           Files: 1
     Directories: 0
  Resident Pages: 488282/488282  1G/1G  100%
         Elapsed: 0.014291 seconds
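
The same no-copy behaviour can be approximated from Python by indexing the mmap object instead of calling `mm.read()` or slicing it, both of which return new `bytes` objects. This is a minimal sketch under that assumption; `touch_pages` is a hypothetical helper name, and the scratch file stands in for /var/tmp/large.

```python
import mmap
import os
import tempfile


def touch_pages(path):
    """Read one byte per page through a read-only mapping.

    Returns (mapping_length, sum_of_bytes_read). Indexing an mmap object
    yields an int, so no buffer proportional to the file size is allocated.
    Note: mmap raises ValueError for an empty file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ) as mm:
            total = 0
            for offset in range(0, len(mm), mmap.PAGESIZE):
                total += mm[offset]  # touching the page faults it in if needed
            return len(mm), total


if __name__ == "__main__":
    # Scratch file for illustration only.
    fd, scratch = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"\x01" * (4 * mmap.PAGESIZE))
        print(touch_pages(scratch))
    finally:
        os.remove(scratch)
```

Running this against a large file while watching `free -m` and `pmap -X` should, if the reasoning in this post holds, show buff/cache and the mapping's Rss growing while the heap stays small.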
 
Old 12-12-2022, 05:47 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,152

Rep: Reputation: 4125
Quote:
Originally Posted by joe_sleeping View Post
Python allocates a buffer in the process & copy the cache into that buffer.
Thanks for the update - I was going to suggest that Python itself was the likely culprit, but having no experience with it, I couldn't really comment.

You might like to look at bpf for finer (more targeted) tracing capabilities.
 
Old 12-12-2022, 05:48 AM   #7
joe_sleeping
LQ Newbie
 
Registered: Dec 2022
Posts: 4

Original Poster
Rep: Reputation: 1
Hi pan64,

I don't think so. My understanding is as follows.

Assuming nothing is in the page cache in either case:
1. A normal file read pulls the data from disk into kernel memory (the page cache), then copies it from the kernel into the process's memory space.
2. After memory-mapping a file, when the process reads some part of the file, the process's page table is consulted to see whether that page is already mapped. If it is not, a page fault occurs: the data is read into kernel memory (the page cache) and the page table is updated to point at it. The process can then access the data through the page table. That's why the "used" value is unchanged while the page cache grows when we read an mmap-ed file.

If I have misunderstood anything, I would appreciate it if you could point it out.
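
The page-fault step described in point 2 can be observed from user space. The following is a minimal sketch, assuming a Unix system where Python's `resource.getrusage` reports `ru_minflt` (faults served without I/O) and `ru_majflt` (faults that required I/O); `faults_while_touching` is a hypothetical helper name. The kernel may map several already-cached neighbouring pages per fault (compare the `filemap_map_pages` entry in the perf output earlier in the thread), so the counts are often lower than the number of pages touched.

```python
import mmap
import os
import resource
import tempfile


def faults_while_touching(path):
    """Touch every page of a read-only mapping of `path`.

    Returns (minor_faults, major_faults) incurred by this process
    between the start and end of the touch loop.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ) as mm:
            before = resource.getrusage(resource.RUSAGE_SELF)
            for offset in range(0, len(mm), mmap.PAGESIZE):
                mm[offset]  # first access to an unmapped page triggers a fault
            after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_minflt - before.ru_minflt,
            after.ru_majflt - before.ru_majflt)


if __name__ == "__main__":
    # Scratch file for illustration only; a freshly written file is usually
    # still in the page cache, so mostly minor faults are expected here.
    fd, scratch = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"\0" * (256 * mmap.PAGESIZE))
        print(faults_while_touching(scratch))
    finally:
        os.remove(scratch)
```

For a file that is not yet cached, major faults would be expected to show up instead, matching the "data is read into kernel memory" step above.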
 
Old 12-12-2022, 06:27 AM   #8
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,039

Rep: Reputation: 7347
All the memory used by a process (code, data, whatever) can end up in the cache, not only the mmapped parts. It is the kernel, not the process, that is responsible for how the cache is used.
What you see depends heavily on the load of the system. The kernel will also try to use all available RAM for caching if possible (but obviously won't if there is no more RAM to use). So I don't really think it is that simple.
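
Because these numbers move with system load, it can help to take a snapshot of `/proc/meminfo` (the file `free` reads) right before and right after a test and compare only the deltas. This is a minimal sketch; `parse_meminfo`, `snapshot` and `delta_mib` are hypothetical helper names, and the chosen keys are just the ones most relevant to the `free` columns discussed in this thread.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def snapshot(path="/proc/meminfo"):
    """Read and parse the current memory counters."""
    with open(path) as f:
        return parse_meminfo(f.read())


def delta_mib(before, after,
              keys=("MemFree", "MemAvailable", "Buffers", "Cached")):
    """Return the change in MiB for each key present in both snapshots."""
    return {k: (after[k] - before[k]) / 1024
            for k in keys if k in before and k in after}


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        first = snapshot()
        # ... run the mmap test here ...
        second = snapshot()
        print(delta_mib(first, second))
```

On a busy machine the deltas will include unrelated activity, so repeating the measurement a few times is advisable.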
 
  

