18-746/15-746: Storage Systems (Fall 2016)

FAQs will be posted here. Keep checking this page.

LAB 2 (CloudFS) is out

Question 1: Are the slides on Project 2 available?

Answer:
Yes. See the lectures web.

Question 2: Username/Password for logging into the Virtual Machine?

Answer:
Username: student
Password: password
This account is not a superuser account. You have been given superuser permissions for only a few select commands (you won't explicitly have to use those though).

Question 3: Is C++ okay?

Answer:
Yes. You can develop your project in C++. So long as you can compile, build and run the project successfully on the VM we have supplied (without installing any additional tools), your project will be graded on Autolab. You may have to change the Makefile to achieve this. You are not allowed to develop cloudfs in languages other than C / C++.

Question 4: How do I debug my cloudfs outside the tests provided?

Answer:
1) Initialize your cloudfs: run ./reset.sh and then ./mount_disks.sh in /src/scripts.
2) Start the S3 server: run ./run_server in /src/s3-server/.
3) Uncomment the line in cloudfs_start that adds the -f parameter to cloudfs (this parameter runs cloudfs in the foreground). Run make.
4) Run cloudfs in the foreground: ./cloudfs --no-dedup (for checkpoint 1) in /src/build/bin/.
5) cd ~/mnt/fuse. Now run applications or commands such as touch, echo, cat, ls, chmod etc. Observe which system calls are invoked and how your cloudfs responds.

Question 5: Using extended attributes gives me an ENOTSUP.

Answer:
You need to specify the attribute class to be able to correctly set an extended attribute.
Please consult the attr man page.
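
As an illustration, on Linux the name passed to setxattr(2) has to carry a class prefix such as "user."; a bare name is typically rejected with ENOTSUP. The sketch below assumes a Linux VM; the helper names user_attr and set_user_attr are hypothetical and not part of the handout.

```cpp
#include <sys/types.h>
#include <sys/xattr.h>

#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical helper: prefix a bare attribute name with the "user." class.
std::string user_attr(const std::string& name) {
    assert(!name.empty());
    return "user." + name;
}

// Hypothetical helper: store `value` under the user class on `path`.
// Returns 0 on success or -errno on failure.
int set_user_attr(const std::string& path, const std::string& name,
                  const std::string& value) {
    const std::string full = user_attr(name);
    if (setxattr(path.c_str(), full.c_str(), value.data(), value.size(), 0) != 0) {
        return -errno;
    }
    return 0;
}
```

A call such as set_user_attr(path, "on_cloud", "1") would then set the attribute user.on_cloud (the attribute name here is only an example).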

Question 6: lost+found directory causing tests to fail?

Answer:
lost+found is a directory created by the Ext4 formatting tools. In fact a lost+found exists when you format a filesystem as any of the Ext filesystems (it may even exist on your own machines!). Our scripts require you to ignore lost+found, i.e. correctness requires you to *not* list lost+found as one of the existing folders when an 'ls' is performed on the FUSE mountpoint. The changes required to ignore lost+found must be made entirely in your FUSE implementation. Please do not modify the scripts you are provided, as doing that might result in autograding problems on Autolab.

Question 7: I keep seeing 'Transport Endpoint not connected' errors. What's going on?

Answer:
This generally indicates that something in your code is broken and has caused CloudFS to throw an error, return a bad status, or crash. Most often it is a segmentation fault that has crashed CloudFS. The FUSE module can then no longer reach your user-level code and cannot send it captured filesystem events. Log enough information (locally, please!) to find which function is failing. Running ./reset.sh in the scripts directory will get things back to normal.

Question 8: How am I supposed to log in cloudfs?

Answer:
Anything you log to /tmp/cloudfs.log will be relayed to you via Autolab. Remember, logging too much can crash Autolab, and the results of too much logging are unpredictable.
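
A minimal sketch of a logging helper, assuming you are free to structure logging however you like. The name log_line is hypothetical. It reopens the file for every line, which is slow but means each line is flushed before a possible crash.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper: append one printf-style line to `log_path`.
// Returns false if the file could not be opened.
bool log_line(const std::string& log_path, const char* fmt, ...) {
    assert(fmt != nullptr);
    std::FILE* f = std::fopen(log_path.c_str(), "a");
    if (f == nullptr) return false;
    va_list args;
    va_start(args, fmt);
    std::vfprintf(f, fmt, args);
    va_end(args);
    std::fputc('\n', f);
    std::fclose(f);  // closing flushes, so the line survives a later crash
    return true;
}
```

For example, log_line("/tmp/cloudfs.log", "getattr(%s) -> %d", path, ret) records one call per line. Keep the volume low, for the reason given above.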

Question 9: How do I submit code for CloudFS?

Answer:
You need to upload a .tar of the src folder.
tar -cvf src.tar src/

Question 10: Hash Header file

Answer:
To maintain hash tables in a cloudfs implemented in C, you can use uthash.h. If you are using C++, the standard library provides map containers: std::unordered_map is a hash table, and std::map is an ordered tree.

For usage refer to : http://troydhanson.github.io/uthash
Students are allowed to use only uthash.h from this link.

One student asked and was granted permission to use this file. Others can do the same.
Decision to use this file is entirely the responsibility of the student.
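
For C++ users, a minimal sketch of a hash table built on std::unordered_map. The class SegmentIndex and the idea of keying reference counts by a segment-hash string are assumptions for illustration, not a required design.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical index: segment hash (as a string) -> reference count.
class SegmentIndex {
public:
    // Records one more reference; returns true if the segment is new.
    bool add_ref(const std::string& hash) { return ++refs_[hash] == 1; }

    // Drops one reference; returns true if it was the last one.
    bool drop_ref(const std::string& hash) {
        auto it = refs_.find(hash);
        assert(it != refs_.end());
        if (--it->second > 0) return false;
        refs_.erase(it);
        return true;
    }

    std::size_t size() const { return refs_.size(); }

private:
    std::unordered_map<std::string, std::size_t> refs_;
};
```

Remember that an in-memory table like this is lost on a crash unless you also persist it.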

Question 11: How do I run the provided tests?

Answer:
Please run "make test_1_n" from the src/ directory, where n is the test number.

Question 12: Crash Recovery: what are the requirements for performing crash recovery in CloudFS?

Answer:
Our main intention behind testing crash recovery is to discourage memory-only implementations. Therefore, you can rest assured that we will crash and restart CloudFS only at "reasonable" points in time, e.g. after all files are closed (which might therefore be a good time for CloudFS to back up all of its in-memory metadata onto the SSD). We won't crash CloudFS while a write operation is in progress (or any other operation that might change the metadata).

Also, we won't be testing Crash Recovery for checkpoint 1, but we will be testing it for checkpoints 2 and 3.

NEW Question 13: How many segments should we download from the cloud for a write on a chunked file?

Answer:
Instead of thinking about the answer to this question, think about the high-level requirement: at any point, the segments that represent the file in its current state should be the same segments that the Rabin Hash algorithm would have generated as if run from the beginning of the file.
To give an example, suppose a file has 6 segments in the cloud, and a write spans from the middle of the 2nd segment to the middle of the 4th segment, completely covering the 3rd segment. Segments 2 and 4 will definitely change, whereas segment 3 will probably vanish from the new file if the data is different (segment 1 will definitely not change). (Thought experiment: do you really need to download the 3rd segment from the cloud?)
Because segment 4 changes, its end boundary may move, which changes segment 5, so you might have to bring in the 5th segment as well! But would you need to bring in segment 6? Maybe, but most probably not. This is an artifact of the parameters that we use for chunking the files: the Rabin window size is always much smaller than the segment sizes, so after 1 or 2 chunks you will observe exactly the same boundaries that the previous content gave, because that data didn't change!
Therefore, you should write code that can handle segment boundaries changing due to a write, and your code should also be able to detect when to stop (which should probably be after 1-2 chunks, depending on when the segment boundaries no longer change).
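
The resynchronization argument can be seen in a toy content-defined chunker. This is a sketch only: it uses a plain rolling sum instead of the Rabin fingerprint, has no minimum or maximum segment size, and the names segment_ends and resync_point are hypothetical. Because each boundary depends only on the last `window` bytes, any boundary whose window lies entirely past an equal-length overwrite is unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the end offset (exclusive) of every segment in `data`.
// A boundary is declared wherever the sum of the last `window` bytes
// has all the bits of `mask` set.
std::vector<std::size_t> segment_ends(const std::string& data,
                                      std::size_t window = 16,
                                      unsigned mask = 0x3F) {
    assert(window > 0);
    std::vector<std::size_t> ends;
    unsigned sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        sum += static_cast<unsigned char>(data[i]);
        if (i >= window) sum -= static_cast<unsigned char>(data[i - window]);
        if ((sum & mask) == mask) ends.push_back(i + 1);
    }
    // The tail of the file always forms a final segment.
    if (ends.empty() || ends.back() != data.size()) ends.push_back(data.size());
    return ends;
}

// Returns the first segment end >= `unaffected_from` that appears in both
// lists, or 0 if there is none. Pass write_end + window as `unaffected_from`;
// from the returned offset onward the old segments can be reused as-is.
std::size_t resync_point(const std::vector<std::size_t>& old_ends,
                         const std::vector<std::size_t>& new_ends,
                         std::size_t unaffected_from) {
    std::size_t i = 0, j = 0;
    while (i < old_ends.size() && j < new_ends.size()) {
        if (old_ends[i] < new_ends[j]) {
            ++i;
        } else if (new_ends[j] < old_ends[i]) {
            ++j;
        } else if (old_ends[i] >= unaffected_from) {
            return old_ends[i];
        } else {
            ++i;
            ++j;
        }
    }
    return 0;
}
```

With minimum and maximum segment sizes, as in a real chunker, a moved boundary can push later boundaries around for a little longer, which is why the text above says 1-2 chunks rather than zero.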

NEW Question 14: Autolab submission error: Error message: 795: unexpected token at ': Error deleting user's files at line 397'

Answer:
This happens when you are logging too much. Please try to reduce the amount of logs.

NEW Question 15: Autolab submission error: Error: Copy in to VM failed (status=-1)

Answer:
This error is non-deterministic, which means you can solve it by resubmitting. Also, we gave everyone five more submissions to make up for the lost submissions. We are trying to figure out the cause of this error and we are sorry about the inconvenience.

LAB 1 (myFTL)

Question 1: How do I know if I have used up too much heap space in my FTL implementation?

Answer:
You can use the massif tool that comes with the valgrind suite. Refer to the massif manual for more information on how to run it. For example, running a program under valgrind --tool=massif writes a massif.out.<pid> file that ms_print can summarize. Remember, naive designs with large memory usage will result in point deductions, so it is important that you optimize in-memory utilization.

Question 2: How do I submit my code to Autolab?

Answer:
As the handout says, you can only modify the myFTL.cpp file. All your changes go into this file. When you want to submit your code, just submit this file on Autolab.

Question 3: Why is there block-level mapping in the overprovisioned space, and not page-level mapping?

Answer:
The reason is that if we map pages of multiple data blocks into a single log-reservation block, when we have to clean, all the data blocks have to be erased (the ones whose pages are in the log-reservation block). Erases are at least an order of magnitude slower than reads / writes. This introduces sudden spikes of latency which are undesirable. There are ways to get around this spike, but I believe they are too complicated for this project. So, the policy we have chosen here is to choose a log-reservation block to buffer rewrites to a single data block, i.e. a log-reservation block cannot buffer writes from multiple data blocks.
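
One way to picture this policy is a table in which each data block owns at most one log-reservation block. The sketch below is only an illustration: the class LogBlockTable and its interface are hypothetical, block numbers are plain integers, and a real FTL would likely want a more memory-efficient structure.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical bookkeeping: data block -> its single log-reservation block.
class LogBlockTable {
public:
    explicit LogBlockTable(std::vector<std::size_t> free_log_blocks)
        : free_(std::move(free_log_blocks)) {}

    // Writes into *out the log block that buffers rewrites for `data_block`,
    // assigning a free one on first use. Returns false when no log block is
    // free, which is the point where cleaning would have to run.
    bool log_block_for(std::size_t data_block, std::size_t* out) {
        assert(out != nullptr);
        auto it = owner_.find(data_block);
        if (it == owner_.end()) {
            if (free_.empty()) return false;
            it = owner_.emplace(data_block, free_.back()).first;
            free_.pop_back();
        }
        *out = it->second;
        return true;
    }

private:
    std::vector<std::size_t> free_;                        // unassigned log blocks
    std::unordered_map<std::size_t, std::size_t> owner_;   // data block -> log block
};
```

Since a log block never holds pages from two data blocks, cleaning one log block touches exactly one data block.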

Question 4: Do the configuration knobs (ssd size, package size, die size, plane size and block size) have to be powers of 2?

Answer:
No, it is not necessary that they be powers of 2. And, we will ensure that the percentage of overprovisioned space does not result in a fractional number of usable LBAs.

Question 5: Is your FTL supposed to support multi-threaded tests?

Answer:
No, for the purpose of this project, the tests will be single-threaded.

Question 6: Can I access/modify the emulator classes?

Answer:
No. You are not supposed to access or modify the emulator classes in any way. The provided instance of ExecCallBack is sufficient to perform everything you may need to do apart from mapping.

Question 7: Will my code be graded for style?

Answer:
Yes, your code will be graded for style. We will use the standard 15-213 coding standards for grading your code.

Question 8: Can I use the C++ standard library or other libraries?

Answer:
You are free to use the C++ standard library. But if you have to use libraries other than the standard library, consult the course staff first.

Question 9: How can I move data around, from PBA0 to PBA1? Why does READ return void instead of the data?

Answer:
To move the data from PBA0 to PBA1, you should do
func(FlashSim::OpCode::READ, PBA0);
func(FlashSim::OpCode::WRITE, PBA1);
i.e. myFTL first reads the data from PBA0 to the buffer and then writes from the buffer to PBA1. The buffer is inside the simulator, so myFTL does not directly access the data. Note that the number of READs should match the number of WRITEs when moving data.
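
To move several pages, the same READ-then-WRITE pairing can be repeated per page. The sketch below uses stand-in types: the local OpCode enum and the Exec callback are assumptions that only mimic the shape of func above, and the real types come from the handout.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stand-ins for the framework types (hypothetical, for illustration only).
enum class OpCode { READ, WRITE, ERASE };
using Exec = std::function<void(OpCode, std::size_t /*pba*/)>;

// Copies each (source, destination) page pair with one READ followed
// immediately by one WRITE, so the two counts always match.
void move_pages(const Exec& exec,
                const std::vector<std::pair<std::size_t, std::size_t>>& moves) {
    assert(exec);
    for (const auto& m : moves) {
        exec(OpCode::READ, m.first);
        exec(OpCode::WRITE, m.second);
    }
}
```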

Question 10: Why did my code fail with "ERROR: free() illegal pointer"?

Answer:
That's because you are calling free or delete with an illegal argument. The most common misuse of delete is to call delete instead of delete[] on a dynamically allocated array. Please make sure that you don't have such errors. You can test your code by running your code through valgrind before submitting on Autolab. (Many checkpoint 2 submissions failed to get a non-zero score on our checkpoint 3 tests because they were using delete in an incorrect way. Be doubly sure that you are not making this mistake).
We have also received reports of code failing with this error even though valgrind shows no issues. We've identified this issue as an issue with "calloc". We request you to not use "calloc" in your code, and instead use malloc with memset.
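
A short illustration of the delete versus delete[] rule described above. The function names are made up for this example. An array obtained with new[] must be released with delete[], and letting std::vector own the memory avoids the question altogether.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Manual allocation: an array from new[] must be released with delete[].
std::size_t sum_manual(std::size_t n) {
    int* table = new int[n]();  // () value-initializes every element to 0
    for (std::size_t i = 0; i < n; ++i) table[i] = static_cast<int>(i);
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += static_cast<std::size_t>(table[i]);
    delete[] table;             // `delete table;` here would be undefined behavior
    return total;
}

// Safer alternative: the vector frees its own storage when it goes out of scope.
std::size_t sum_vector(std::size_t n) {
    std::vector<int> table(n, 0);
    assert(table.size() == n);
    for (std::size_t i = 0; i < n; ++i) table[i] = static_cast<int>(i);
    std::size_t total = 0;
    for (int v : table) total += static_cast<std::size_t>(v);
    return total;
}
```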

Question 11: Why does the framework print "Oh no! Hook is null."?

Answer:
Please don't run "make checkpoint_3" or "make test_3_n". You'll have to run "make run_memtool_test_3_n" instead of "make test_3_n", where n is the test number.

Question 12: Why do the configuration files contain a SELECTED_GC_POLICY even though we've been encouraged to design our own cleaning policies?

Answer:
We made a small cosmetic mistake (mostly intentional, to allow backward compatibility with checkpoint 2 code). You can safely ignore the value of that parameter.

Question 13: I got an exception when trying to free a pointer that was calloc'ed. How do I solve it?

Answer:
Currently our mock library of malloc, libmockmalloc, does not support calloc. Please use a combination of malloc and memset to bypass this issue.
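
A minimal sketch of such a replacement, assuming you want calloc-like behavior. The name zeroed_alloc is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for calloc(count, size): malloc followed by memset.
// Returns nullptr if the size computation would overflow or malloc fails.
void* zeroed_alloc(std::size_t count, std::size_t size) {
    if (size != 0 && count > SIZE_MAX / size) return nullptr;  // overflow guard
    const std::size_t bytes = count * size;
    assert(count == 0 || bytes / count == size);  // multiplication did not wrap
    void* p = std::malloc(bytes);
    if (p != nullptr) std::memset(p, 0, bytes);
    return p;
}
```

Memory obtained this way is released with free(), exactly as with malloc.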

Question 14: Can I use xxx design in checkpoint 3? Can I optimize according to a specific test case?

Answer:
You are free to use any design in checkpoint 3 and you can safely ignore all design constraints from previous checkpoints. However, you are responsible for justifying all your design decisions in your report. For example, "if it is test case 10, do something" is not a reasonable justification.

Question 15: Do I have to write my final report in double-column format?

Answer:
Nope. We just gave a link to a double-column paper for those who might be interested in writing their report that way. You are free to choose whatever format you want. But, please do adhere to the rest of the guidelines, like keeping it down to below 3 pages, writing your name and Andrew ID at the top etc.

Question 16: Will there be any writeup for the final report?

Answer:
Nope. The email that was sent to 746-announce (titled Final report for project 1 (myFTL)) has all the required information for writing the report.
