C
Efficient strlen(3) implementation in C (part 2)
Some time ago, I wrote [1] a few words about the strlen(3) implementation and how inefficient it is on most systems. Although optimizing this function is definitely not the best way to make your program run faster (refer to "Reducing complexity" in the previous strlen(3) post), it is interesting to dig deeper into how modern CPUs can help us compute a string length faster.
- Romain Tartière's blog
Learning GEOM: Tasting
I have been quite busy these last few days and didn't get as much time as expected to have fun with geom(4), the FreeBSD modular disk transformation framework. Since the documentation is quite sparse compared to the rocket science inside GEOM, I spend a lot of time in man pages and reading the source code of existing GEOM classes... However, as recently said on the freebsd-geom mailing list:
Ahh you're believing the code comment :( (Rule #1: Never believe the code comments).
Well, this doesn't make things easy, but I can cope with that (or at least I hope so). I have already located interesting classes to study for the purpose of geom_lvm2:
- geom/geom_bsd.c
- This class looks relatively simple and is well commented. It seems to be a good starting point for learning how GEOM works (687 SLOC);
- geom/concat/g_concat.c
- This class seems a little more complex. Concatenation will have to be handled when a logical volume has physical extents on different physical volumes. It may be possible to rely directly on this class for this purpose (876 SLOC);
- geom/vinum/*
- Gvinum is a really complex^Wcool piece of code. Since it internally relies on various representations of disk chunks to provide services such as RAID, it can be a good source of inspiration for geom_lvm2's internals (7637 SLOC).
Efficient strlen(3) implementation in C
Yesterday, somebody reached my website by googling for "efficient strlen implementation". This reminded me of an old discussion on code optimisation and how it was possible to dramatically reduce the execution time of some basic functions.
Common implementation
Let's see how strlen(3) is implemented under FreeBSD 6.2 (the system I'm running; chances are an equivalent implementation exists for nearly every operating system):
size_t
strlen(str)
	const char *str;
{
	const char *s;

	for (s = str; *s; ++s);
	return(s - str);
}
The design is simple: a pointer (s) runs along the given string (str) looking for the end-of-string character ('\0' == *s). The string length is then computed by subtracting the address of the beginning of the string (str) from the address held by the pointer (s). While simple and portable, this implementation is not efficient: each loop iteration examines a single byte, and whether the loop continues depends on that result. Consequently, the CPU has little opportunity to pipeline the work.
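To give an idea of what doing more work per iteration can look like, here is a minimal sketch of a well-known word-at-a-time technique. It is not necessarily the approach developed in the full post, and the name strlen_wordwise is hypothetical. It assumes 8-bit bytes and a platform where reading a whole aligned 8-byte word that contains the terminator is harmless. Strict ISO C does not guarantee this, since the load can touch up to seven bytes past the end of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; not taken from FreeBSD or from the original post. */
size_t
strlen_wordwise(const char *str)
{
	const char *s = str;
	uint64_t w;

	assert(str != NULL);

	/* Step byte by byte until s sits on an 8-byte boundary. */
	while (((uintptr_t)s % sizeof(w)) != 0) {
		if (*s == '\0')
			return (size_t)(s - str);
		++s;
	}

	/* Examine eight bytes per iteration. */
	for (;;) {
		/* Assumption: this aligned load may read past the terminator. */
		memcpy(&w, s, sizeof(w));
		/* Non-zero if and only if at least one byte of w is zero. */
		if (((w - UINT64_C(0x0101010101010101)) & ~w &
		    UINT64_C(0x8080808080808080)) != 0)
			break;
		s += sizeof(w);
	}

	/* The terminator is somewhere in this word: locate it byte by byte. */
	while (*s != '\0')
		++s;
	return (size_t)(s - str);
}
```

The loop still exits on a data-dependent test, but that test now runs once per eight bytes instead of once per byte. Whether this is actually faster on a given machine has to be measured.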
Dumping LVM2 logical volumes under FreeBSD
At work, I generally use the Debian GNU/Linux operating system: as all our servers run Debian, it avoids loads of clashes when we put our work into production. But I frequently complain about how Debian works and, overall, can't bear its frustrating package management system (which gets even worse when you run an advanced-user / developer / desktop machine). Anyway, there are many situations where I could do my job with another operating system that fits my needs much better, let's say FreeBSD... The problem is that, as a paranoid sysadmin, my current /home directory is an LVM2 logical volume on top of software RAID controlled by mdadm.
FreeBSD features a cool abstraction layer in the kernel called GEOM, documented as a "modular disk I/O request transformation framework". Basically, it fits between your devices as represented by files in /dev and the physical disks, providing on-the-fly mirroring / striping / whatever, without the need for any external tool (it's a bit more complex in fact, but let's keep it simple).
Unfortunately, all of this is utterly incompatible with both MD and LVM2.
Efficient tests with optional diagnostic in C
I love C because it is a low level high level programming language.
I hate C for exactly the same reason.
If you have already written a C program that relies on a lot of system calls and tests to check the execution environment before doing something, you have probably entered the hell of verifying loads of return values and displaying diagnostic messages for each failure case.
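As an illustration of the kind of pattern this is about, here is a minimal sketch of a helper that wraps a test and prints a diagnostic only when asked to. It is not necessarily the approach taken in the full post: the names diagnostics_enabled, checked and file_is_readable are hypothetical, and using errno for the message is an assumption that only makes sense right after a failed library call.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical switch: set to 0 to silence diagnostics. */
static int diagnostics_enabled = 1;

/*
 * Hypothetical helper: returns ok unchanged and, when the test failed and
 * diagnostics are enabled, reports what was being checked along with the
 * message for the current errno value.
 */
int
checked(int ok, const char *what)
{
	if (!ok && diagnostics_enabled)
		fprintf(stderr, "%s: %s\n", what, strerror(errno));
	return ok;
}

/* Example use: the test and its optional diagnostic fit on one line. */
int
file_is_readable(const char *path)
{
	FILE *f = fopen(path, "r");

	if (!checked(f != NULL, path))
		return 0;
	fclose(f);
	return 1;
}
```

Because checked() returns its first argument, several tests can be chained with && and each failure still names itself on stderr.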
Incremental backup from shell scripts
In the past few days, I have been working on an incremental backup solution for my company's dedicated servers. These servers host gigabytes of data spread between websites, document repositories and tools.
For each of our machines, a complete backup consists of two actions:
- a local copy with appropriate privileges of everything that cannot be safely hot-copied to a remote location (such as database clusters that have to be dumped to SQL files and system files that need root privileges to be read);
- a copy of the directories to save on the host to a backup server.
The first part is performed by Backupninja, a very flexible tool that relies on many other programs to do its job. I particularly like it since it follows the UNIX philosophy: "do one thing and do it well". Basically, this first part was trivial to set up.