A PDF version is available here.
Friday, November 7, 10pm
This assignment builds on Assignment 5. Use your submission for it as a starter for this assignment. We will provide a sample implementation of Assignment 5 five days after its deadline. This also means that we will not re-evaluate parts relevant to A5 for this assignment.
This is a pair assignment, but you can work alone, if you so choose.
Submit the contents of your repository via Gradescope. See Deliverables below for what to submit. If you are working with a partner, do not forget to include their name with the submission. Only submit one solution for both partners.
There will be no autograder for this assignment ahead of the deadline. Read the requirements and run tests locally.
For this assignment, we will write a better version of the memory allocator from Assignment 5.
mmap to Allocate Page-sized
BlocksmmapFor the basic allocator implemented in Assignment 5, you
were asked to use the sbrk system call to change
the size of the heap. This call is not the only way to request memory
for our process. In fact, the sbrk syscall is considered
deprecated and its use is discouraged. We have used it in our
first allocator because it provides a simple and easy-to-use interface
for requesting heap memory from the OS.
So what does a modern Unix/Linux want us to use for memory
allocation? The modern way to allocate memory is to directly request
pages for a process using the mmap system call. In
general, mmap allows us to map a file or a device into
memory, meaning that any reads/writes within the memory region returned
from a successful mmap call are reflected in the file (more
on this in the file system project). In other words, mmap
allows us to allocate memory from the operating system, backed by a
particular file. On Linux and some other operating systems, we can even
request “anonymous” mappings, meaning that the returned region is not
backed by any file. In this mode, mmap can behave like
malloc.
An “mmapped” region’s size is always aligned along the boundary of a
page.
This means that one constraint is that allocations with
mmap should be made in multiples the size of a page. Even
if you specify a size that is not a multiple of the system’s page size,
mmap will round up to the nearest page. For example, if our
page size is 4KB (4096 bytes), and we only use 12 bytes in total during
our program’s execution, we have quite a bit of waste! In practice, for
large desktop applications this is not a major issue.
Our task is to update our allocator implementation to use
mmap instead of sbrk. Remember that, unlike
sbrk, mmap will return a pointer to the
beginning of a page of memory. Here is the overall updated strategy for
mymalloc and myfree. In the text below,
PAGE_SIZE refers to the system’s page size which you can
find out using the sysconf routine (see its manpage).
Do not hard-code this value anywhere.
mymallocPAGE_SIZE - sizeof(block_t) (aka “small block”):
mmap to
request a new page and set it up as a block.sizeof(block_t) bytes of user memory. That means, the free
list shouldn’t contain blocks that are smaller than the header
size.PAGE_SIZE - sizeof(block_t) (aka “large block”):
PAGE_SIZE.myfreePAGE_SIZE - sizeof(block_t):
add the block to your free list and coalesce (see below) if needed. Keep
at most 2 page-sized (i.e., the user’s block size is
PAGE_SIZE - sizeof(block_t)) blocks in your free list. Use
munmap to unmap extra blocks that take up a page.PAGE_SIZE - sizeof(block_t): use munmap to
unmap it.mycallocThe calloc function guarantees that a block it returns
only contains null bytes. However, reading the manpage for
mmap you will notice that pages requested with
MAP_ANONYMOUS are already initialized to zero when returned
from the OS. Only use memset when needed to zero blocks you
are returning from calloc.
When adding a block into your free list, keep the list sorted by the memory address of the blocks. This will allow coalescing: whenever two blocks in the free list form a continuous area of memory, they should be merged into one block (coalesced)
Since you insert into the free list and need to handle this in two different places, a helper function is a good idea.
In addition to the original debug messages, use
debug_printf to log the following messages from
mymalloc, mycalloc, and
myfree.
| Function | small / large | Condition / Event | Printout |
|---|---|---|---|
mymalloc |
small | exact block is found in free list | "malloc: block of size %zu found\n" |
mymalloc |
small | no block found | "malloc: block of size %zu not found - calling mmap\n" |
mymalloc |
small | splitting block | "malloc: splitting - blocks of size %zu and %zu created\n" |
mymalloc |
large | mmap request | "malloc: large block - mmap region of size %zu\n" |
myfree |
small | coalescing | "free: coalesce blocks of size %zu and %zu to new block of size %zu\n" |
myfree |
large | unmaping | "free: munmap region of size %zu" |
For example, let’s say that we are servicing the first
malloc request after the program was started and the
request size is 1024 bytes, header size is 16 bytes, the system page
size is 4096 bytes. Then the following log trace should be visible:
DEBUG: Malloc 1024 bytes
DEBUG: malloc: block of size 1024 not found - calling mmap
DEBUG: malloc: splitting - blocks of size 1024 and 3040 created
DEBUG: malloc: block of size 1024 found
If free was called immediately after, we get the
following messages:
DEBUG: free: coalesce blocks of size 1024 and 3040 to new block of size 4080
DEBUG: Freed 1024 bytes
Here’s a large block example with malloc followed by a
free of the same block. Request is for 12000 bytes, header
is 16 bytes, page size is 4096:
DEBUG: Malloc 12000 bytes
DEBUG: malloc: large block - mmap region of size 12288
...
DEBUG: free: munmap region of size 12288
DEBUG: Freed 12288 bytes
mymalloc.c and include
any additional .c and .h files your
implementation relies on. For example, you might want to compile your
helper data structure separately.
Commit the code to your repository. Do not include any executables,
.o files, or other binary, temporary, or hidden files.
Once you are done, remember to submit your solution to Gradescope and do not forget to include your partner.
man pages of mmap,
munmap, malloc, calloc,
free, realloc, …mmap arguments. Request multiples of
PAGE_SIZE. You’ll want an anonymous
private mapping, that is both readable and
writable. The flags you’re looking for are
MAP_PRIVATE and MAP_ANONYMOUS, and the
protection PROT_READ and PROT_WRITE. For
anonymous mappings, use -1 as the file descriptor. Offset
should be 0.assert to check that your
assumptions about state are valid.asserts to check expected results. Use our tests for
queue/vector from Assignment 4 as an example.if-else if-else or a multi-case
switch should
be the only reason to go beyond 40-50 lines per function. Even so, the
body of each branch/case should be at most 3-5 lines long.memset (if you are using
memset in calloc) that you are not
memset-ing over your block header.mallocs and frees, mallocs of a
wide range of sizes to exercise the two block size strategies, in
particular the edge cases.sysconf(_SC_PAGE_SIZE)
to get the OS’s page size. Check the manpage for sysconf to
see which header you need to include.0, meaning a freshly mmapped page does not need to be
memset to initialize it.sbrk calls
are no longer valid. Because we are releasing large blocks of memory to
the OS and coalescing smaller blocks, tests 6 and 7 might be
prohibitively slow. In test1, change the definition of
ARRAY_ELEMENTS (line 12 in the
original version) from 1024 to 512.Q: If you need to split a block but the amount of
remaining memory in the block is less than the amount of memory of a new
block header (which we need to split the remaining memory into a new
block), what should we do?
A: In this case you do not need to do anything, that is
an acceptable amount of fragmentation to live with.