Benchmarking filesystems

By Crazy · 5 replies
Apr 3, 2008
  1. Hey,

    I'm doing my internship atm and one of the projects I have to do is setting up a clustered file-system.
    I'm using glusterfs, so now I want to run some benchmarks on it.

    The question is: what tools are best to use and, more importantly, how to use them.
    For benchmarking I found bonnie++ and iozone (I'm using Fedora 8 atm). But how do I use them properly?

    The thing I have to do is write many small files to the filesystem. They will get written to different storage servers.

    I'm new to Linux stuff, so any help will be appreciated!
  2. Nodsu

    Nodsu TS Rookie Posts: 5,837   +6

    Hum.. man bonnie++? man iozone?

    The sad truth is that there is no "proper" way to run these things, because it all depends on what and how you are benchmarking. The main guideline would be to come up with a set of benchmarks that are the same and fair for all test cases.

    Also, bonnie and iozone are artificial benchmarks and therefore almost useless. You should consider setting up something real-life-like and get some useful results with a realistic workload based on some assumed deployment scenario for your filesystem.
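    A minimal sketch of such a "real-life-like" workload generator, assuming bash (the target directory, file count, and size range below are placeholders, not from the thread):

```shell
#!/bin/bash
# Sketch: populate a directory with many small files of varying size,
# as a rough stand-in for a realistic workload.
# TARGET and the size range are placeholders; point TARGET at the
# mounted filesystem for a real run.
TARGET=${1:-/tmp/bench_demo}
COUNT=${2:-50}
mkdir -p "$TARGET"

for (( i = 0; i < COUNT; i++ )); do
  size=$(( RANDOM % 901 + 100 ))   # file size in KB, 100..1000
  dd if=/dev/zero of="$TARGET/file$i" bs=1K count="$size" 2>/dev/null
done
echo "wrote $COUNT files to $TARGET"
```

    Running it under `time` gives a crude creation benchmark; re-running it against an already populated directory exercises a different (and often more realistic) case.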
  3. Crazy

    Crazy TS Rookie Topic Starter Posts: 139

    To get close to what is happening now in production, I would create a lot of new files from 100 KB to 1 MB in /home/import, where I mounted the filesystem.
    Timing that would make the best benchmark.

    But how do I do it? I'm pretty inexperienced with Linux :(, still learning new things each day :)

    The place I'm doing my internship is here
    I would be testing it to see if it's doable in production.

    What do you mean by "artificial benchmarks"? The thing I'm trying to test is how glusterfs scales, so writing a lot of files, very fast. I have a gigabit switch available, so bandwidth isn't an issue.

    Thanks for the help :)
  4. Crazy

    Crazy TS Rookie Topic Starter Posts: 139

    Ok, so I've made this:

    for (( i = 0; i <= 1000; i++ )); do
      dd if=/dev/zero of=/home/import/1Mfile$i bs=1M count=1
    done
    rm -f /home/import/*

    But how can I hide the output? I execute the script with "time ./<script>".
    But how do I suppress the output from 'dd'? I tried adding '> /dev/null' but that didn't work.
    Any ideas? I'm new at this kind of stuff :p

    * EDIT *

    Ok, nvm. Found it.
    `dd if=/dev/zero of=/home/import/1kfile$i bs=1KB count=1 2>/dev/null`
    Had to add a '2' (dd prints its stats to stderr, not stdout). Typical, I searched/tried for a while, didn't find it, and the moment I ask for help I found the solution :p

    Have another question:
    How can I get the time inside a script?
    So something like this:

    for (( i = 0; i <= 1000; i++ )); do
      dd if=/dev/zero of=/home/import/1kfile$i bs=1KB count=1 2>/dev/null
    done
    echo TIME
    for (( i = 0; i <= 10; i++ )); do
      dd if=/dev/zero of=/home/import/1kfile$i bs=500KB count=1 2>/dev/null
    done
    echo TIME2
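    Two common ways to fill in those TIME placeholders in bash: `time` is a shell keyword, so it can prefix a whole loop, and the builtin `$SECONDS` counter holds elapsed whole seconds since the script started. A sketch (the output directory is a placeholder):

```shell
#!/bin/bash
# Two ways to time sections of a bash script.
OUT=${1:-/tmp/time_demo}   # placeholder; use the glusterfs mount for real runs
mkdir -p "$OUT"

# 1) `time` is a bash keyword, so it can time a whole loop:
time for (( i = 0; i < 10; i++ )); do
  dd if=/dev/zero of="$OUT/1kfile$i" bs=1KB count=1 2>/dev/null
done

# 2) $SECONDS holds elapsed whole seconds since the script started:
start=$SECONDS
for (( i = 0; i < 10; i++ )); do
  dd if=/dev/zero of="$OUT/500kfile$i" bs=500KB count=1 2>/dev/null
done
echo "second loop took $(( SECONDS - start )) seconds"
```

    For sub-second resolution, `date +%s.%N` before and after a section works on Linux as well.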
  5. jobeard

    jobeard TS Ambassador Posts: 11,158   +986

    a- all benchmarks are artificial, as one can only document a controlled environment.
    b- your scripting is ok, but it serializes the file I/O (just one kind of test)
    c- use an outer script to launch the existing script into the background by appending '&' to the invocation
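    Point (c) could look something like this sketch, where writer() stands in for the dd loop from the earlier post and the paths are placeholders:

```shell
#!/bin/bash
# Sketch: launch several writers in parallel with '&' and time the batch.
OUT=${1:-/tmp/parallel_demo}   # placeholder target directory
JOBS=${2:-4}
mkdir -p "$OUT"

writer() {   # $1 = writer id; stands in for the inner benchmark script
  for (( i = 0; i < 10; i++ )); do
    dd if=/dev/zero of="$OUT/w$1_f$i" bs=1K count=1 2>/dev/null
  done
}

time {
  for (( j = 0; j < JOBS; j++ )); do
    writer "$j" &   # background each writer so the I/O overlaps
  done
  wait              # block until every background job finishes
}
```

    With real scripts instead of a function, the loop body would be `./writer.sh "$j" &` and the `wait` stays the same.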

    I suggest making bs=X and count=Y parameters, so you can measure blocking factors as
    well as file sizes.
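    A parameterized version might look like this (the script name, defaults, and target directory are assumptions):

```shell
#!/bin/bash
# Sketch: bs and count as parameters, so blocking factor and file size
# can be varied without editing the script.
# Usage: ./mkfiles.sh [bs] [count] [nfiles] [dir]
BS=${1:-1K}
COUNT=${2:-1}
NFILES=${3:-100}
DIR=${4:-/tmp/param_demo}   # placeholder; use the real mount point
mkdir -p "$DIR"

for (( i = 0; i < NFILES; i++ )); do
  dd if=/dev/zero of="$DIR/file$i" bs="$BS" count="$COUNT" 2>/dev/null
done
```

    For example, `time ./mkfiles.sh 4K 256 1000` would write 1000 files of 1 MB each in 4 KB blocks, while `time ./mkfiles.sh 1M 1 1000` writes the same amount of data with one large block per file.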

    1. you need multiple file creation,
    2. multiple file reads (cat $fn >/dev/null)
    3. multiple file updates-inplace.


    You also need some means to perform these actions at random locations.
    (3) above will assist here, as the files are preexisting and thus the HD arm moves
    to each as needed.

    (3) will require a program that fopens the file in mode 'r+'. This causes
    existing sectors to be overwritten, rather than deleting the old file (as
    fopen mode 'w' does) and then just recreating it.
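    From the shell, a rough analogue of the fopen 'r+' update-in-place is dd with conv=notrunc, which rewrites blocks where they sit instead of truncating the file (the file path below is a placeholder):

```shell
#!/bin/bash
# Sketch: update-in-place with dd. conv=notrunc keeps the file at its
# current size and rewrites only the targeted block, roughly like
# fopen(..., "r+") instead of "w".
F=/tmp/inplace_demo   # placeholder file
dd if=/dev/zero of="$F" bs=1K count=10 2>/dev/null                        # create a 10 KB file
dd if=/dev/urandom of="$F" bs=1K count=1 seek=5 conv=notrunc 2>/dev/null  # rewrite block 5 only
# the file is still 10 KB; without conv=notrunc dd would truncate it
```

    Randomizing the `seek=` value across preexisting files gives the random-location updates suggested above.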
  6. Nodsu

    Nodsu TS Rookie Posts: 5,837   +6

    Well, the purpose of your system is not to be filled with files from zero, is it?
    I would imagine the purpose of the system is to be full of files and handle operations on these files real fast.

    So your starting point would be to put a bunch of real-looking data there. Maybe copy it from an existing system, possibly obfuscating file contents and filenames for security reasons.
    After that you could simulate tons of simultaneous file accesses, writes, creations, deletions etc., whatever the real-life situation is. After you are done with the first run, format the filesystem clean, copy the same starting-point data again and start over.

    It's a less known fact.. you can do pretty much everything with for loops that you can do with normal commands, including timing them:
    time for i in whatever; do ...; done
    time for j in somethingelse; do ...; done
