Learning GEOM: Tasting
Category: FreeBSD.
I have been quite busy these last few days and didn't get as much time as expected to have fun with geom(4), the FreeBSD modular disk transformation framework. Since documentation is quite sparse compared to the rocket science inside GEOM, I have spent a lot of time in man pages and reading the source code of existing GEOM classes... However, as recently said on the freebsd-geom mailing list:
Ahh you're believing the code comment :( (Rule #1: Never believe the code comments).
Well, this doesn't make things easy, but I can cope with that (or at least I hope so). I have already located interesting classes to study for the purpose of geom_lvm2:
- geom/geom_bsd.c
- This class looks relatively simple and is well commented. It seems to be a good starting point for learning how GEOM works (687 SLOC);
- geom/concat/g_concat.c
- This class seems a little more complex. Concatenation will have to be handled when a logical volume has physical extents on different physical volumes. It may be possible to rely directly on this class for this purpose (876 SLOC);
- geom/vinum/*
- Gvinum is a really complex^Wcool piece of code. Since it internally relies on various representations of disk chunks to provide services such as RAID, it can be a good source of inspiration for geom_lvm2's internals (7637 SLOC).
Tasting experiments
When not reading existing code, I try to implement basic GEOM classes that provide simple facilities. For the moment, I am focusing on tasting. Basically, tasting occurs when a new GEOM class or provider is created (e.g. when a geom kernel module is loaded or when a new device is connected to the system). The GEOM class will test the device to see if it can handle it (generally looking for magic data).
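To make "looking for magic data" concrete, here is a minimal userland sketch of such a check for an LVM2 label. The function name looks_like_lvm2_label() and the macro names are hypothetical; the offsets are simply those visible in the dumps further down in this post ("LABELONE" at byte 0 of the label sector, "LVM2 001" at byte 24), not taken from the LVM2 sources.

```c
#include <stddef.h>
#include <string.h>

/* Values read off the hex dumps in this post; treat them as assumptions. */
#define LVM2_LABEL_MAGIC  "LABELONE"   /* bytes 0..7 of the label sector */
#define LVM2_LABEL_TYPE   "LVM2 001"   /* bytes 24..31 of the label sector */
#define LVM2_TYPE_OFFSET  24
#define LVM2_MAGIC_LEN    8

/*
 * Return 1 if the buffer starts with something that looks like an LVM2
 * label, 0 otherwise.  The buffer is expected to hold the sector read
 * from the tasted device.
 */
int
looks_like_lvm2_label(const unsigned char *sector, size_t len)
{
    if (sector == NULL || len < LVM2_TYPE_OFFSET + LVM2_MAGIC_LEN)
        return (0);
    if (memcmp(sector, LVM2_LABEL_MAGIC, LVM2_MAGIC_LEN) != 0)
        return (0);
    return (memcmp(sector + LVM2_TYPE_OFFSET, LVM2_LABEL_TYPE,
        LVM2_MAGIC_LEN) == 0);
}
```

In a taste function, a check like this would decide whether to keep the geom or to tear everything down and return NULL.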
Declaring a GEOM class with tasting capabilities
Declaring a GEOM class that can perform tasting operations is as simple as setting the .taste member of the data structure passed to the DECLARE_GEOM_CLASS() macro:
    #define TASTE_CLASS_NAME "TASTE"

    static struct g_class g_taste_class = {
        .name = TASTE_CLASS_NAME,
        .version = G_VERSION,
        .taste = g_taste_taste,
        .destroy_geom = g_taste_destroy_geom,
    };

    DECLARE_GEOM_CLASS(g_taste_class, g_taste);
The g_taste_taste() function has to be declared as:
static struct g_geom * g_taste_taste(struct g_class *mp, struct g_provider *pp, int flags __unused)
Reading tasted device
In order to read data on the device being tested, we first have to attach a consumer to it.
This is a three-step operation: create an instance of the geom class ①, add a consumer to it ②, and attach that consumer to the device provider ③:
    static struct g_geom *
    g_taste_taste(struct g_class *mp, struct g_provider *pp, int flags __unused)
    {
        struct g_consumer *cp;
        struct g_geom *gp;

        g_trace(G_T_TOPOLOGY, "%s(%s,%s)", __func__, mp->name, pp->name);
        g_topology_assert();

        gp = g_new_geomf(mp, "taste:taste");    /* ① */
        cp = g_new_consumer(gp);                /* ② */
        g_attach(cp, pp);                       /* ③ */
    [...]
It is then possible to read data from the disk. Once again, this is a bit trickier than in userland: we first have to gain read access ①, release the topology lock ②, read what we want ③, re-acquire the topology lock ④, and drop read access ⑤:
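    [...]
        int error;
        u_char *buf;

        error = g_access(cp, 1, 0, 0);          /* ① */
        if (error)
            return (NULL);  /* note: cp and gp are still allocated on this path */

        g_topology_unlock();                    /* ② */
        buf = g_read_data(cp, 0x200, pp->sectorsize, &error);   /* ③ */
        g_topology_lock();                      /* ④ */
        g_access(cp, -1, 0, 0);                 /* ⑤ */
        if (buf == NULL)
            return (NULL);  /* same remark: cp and gp are not released here */
        /* buf was allocated by g_read_data(); g_free() it once done */
    [...]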
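For comparison, here is roughly what the same read looks like in userland. This is only a sketch: read_at() is a hypothetical helper name, and the mapping to GEOM calls in the comments is a loose analogy rather than an exact equivalence.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read `len` bytes at byte offset `off` from the device node or image
 * file at `path`.  Returns a malloc()ed buffer that the caller must
 * free(), or NULL on error or short read.
 */
unsigned char *
read_at(const char *path, off_t off, size_t len)
{
    unsigned char *buf;
    ssize_t n;
    int fd;

    fd = open(path, O_RDONLY);      /* loosely: attach + gain read access */
    if (fd < 0)
        return (NULL);

    buf = malloc(len);
    if (buf == NULL) {
        close(fd);
        return (NULL);
    }

    n = pread(fd, buf, len, off);   /* loosely: g_read_data() */
    close(fd);                      /* loosely: drop access + detach */

    if (n < 0 || (size_t)n != len) {
        free(buf);
        return (NULL);
    }
    return (buf);
}
```

No explicit locking appears here, which is exactly the difference with the kernel version: in a taste function the topology lock is held on entry and has to be released around the actual I/O.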
When done, we have to release all allocated resources:
    [...]
        g_detach(cp);
        g_destroy_consumer(cp);
        g_destroy_geom(gp);

        return (NULL);
    }
Not so obvious, is it?
Real life example
I have started to write a taste GEOM class that looks for LVM2 physical volumes. Here is what it appends to dmesg(8) when I kldload(8) it (with the sysctl kern.geom.debugflags=1):
    g_modevent(TASTE, LOAD)
    g_post_event_x(0xc0640bac, 0xc1a3fa00, 2, 262144)
    g_load_class(TASTE)
    g_taste_taste(TASTE,ad0s1)
    g_detach(0xc1a8a3c0)
    g_destroy_consumer(0xc1a8a3c0)
    g_destroy_geom(0xc1afc900(taste:taste))
    g_taste_taste(TASTE,ad0s1f)
    g_detach(0xc1aee480)
    g_destroy_consumer(0xc1aee480)
    g_destroy_geom(0xc1d33700(taste:taste))
    g_taste_taste(TASTE,ad0s1e)
    g_detach(0xc1aee580)
    g_destroy_consumer(0xc1aee580)
    g_destroy_geom(0xc1d33480(taste:taste))
    g_taste_taste(TASTE,ad0s1d)
    g_detach(0xc1aee600)
    g_destroy_consumer(0xc1aee600)
    g_destroy_geom(0xc1aa8080(taste:taste))
    g_taste_taste(TASTE,ad0s1c)
    g_detach(0xc1aeec00)
    g_destroy_consumer(0xc1aeec00)
    g_destroy_geom(0xc1d33100(taste:taste))
    g_taste_taste(TASTE,ad0s1b)
    g_detach(0xc1a8a340)
    g_destroy_consumer(0xc1a8a340)
    g_destroy_geom(0xc1d32e00(taste:taste))
    g_taste_taste(TASTE,ad0s1a)
    g_detach(0xc1aef300)
    g_destroy_consumer(0xc1aef300)
    g_destroy_geom(0xc1d31200(taste:taste))
    g_taste_taste(TASTE,ad2)
    ===> Found LVM2 physical device ad2
    0000 4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00 |LABELONE........|
    0010 f6 30 e2 a1 20 00 00 00 4c 56 4d 32 20 30 30 31 |.0.. ...LVM2 001|
    ---> Physical Volume ID: AsW40RRq074Ac0P03XLZp1IJfrg1DPbq
    0000 41 73 57 34 30 52 52 71 30 37 34 41 63 30 50 30 |AsW40RRq074Ac0P0|
    0010 33 58 4c 5a 70 31 49 4a 66 72 67 31 44 50 62 71 |3XLZp1IJfrg1DPbq|
    0020 00 40 32 0f 00 00 00 00 00 00 03 00 00 00 00 00 |.@2.............|
    0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0040 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 |................|
    0050 00 f0 02 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0070 00 00 00 00 00 00 00 00 |........ |
    ---> Dumping memory at 0x1000
    0000 7c 99 ff fc 20 4c 56 4d 32 20 78 5b 35 41 25 72 ||... LVM2 x[5A%r|
    0010 30 4e 2a 3e 01 00 00 00 00 10 00 00 00 00 00 00 |0N*>............|
    0020 00 f0 02 00 00 00 00 00 00 08 00 00 00 00 00 00 |................|
    0030 17 07 00 00 00 00 00 00 d9 35 97 54 00 00 00 00 |.........5.T....|
    0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    ---> Found location:
    0000 00 08 00 00 00 00 00 00 17 07 00 00 00 00 00 00 |................|
    0010 d9 35 97 54 00 00 00 00 |.5.T.... |
    g_detach(0xc1a8a240)
    g_destroy_consumer(0xc1a8a240)
    g_destroy_geom(0xc1ad7200(taste:taste))
    g_taste_taste(TASTE,ad1)
    ===> Found LVM2 physical device ad1
    0000 4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00 |LABELONE........|
    0010 1c 08 45 cb 20 00 00 00 4c 56 4d 32 20 30 30 31 |..E. ...LVM2 001|
    ---> Physical Volume ID: sGOnNc3VOhRU7iFVkc335jcyrPLVf3Pl
    0000 73 47 4f 6e 4e 63 33 56 4f 68 52 55 37 69 46 56 |sGOnNc3VOhRU7iFV|
    0010 6b 63 33 33 35 6a 63 79 72 50 4c 56 66 33 50 6c |kc335jcyrPLVf3Pl|
    0020 00 44 2d 10 00 00 00 00 00 00 03 00 00 00 00 00 |.D-.............|
    0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0040 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 |................|
    0050 00 f0 02 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0070 00 00 00 00 00 00 00 00 |........ |
    ---> Dumping memory at 0x1000
    0000 06 83 18 71 20 4c 56 4d 32 20 78 5b 35 41 25 72 |...q LVM2 x[5A%r|
    0010 30 4e 2a 3e 01 00 00 00 00 10 00 00 00 00 00 00 |0N*>............|
    0020 00 f0 02 00 00 00 00 00 00 28 00 00 00 00 00 00 |.........(......|
    0030 17 07 00 00 00 00 00 00 d9 35 97 54 00 00 00 00 |.........5.T....|
    0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    ---> Found location:
    0000 00 28 00 00 00 00 00 00 17 07 00 00 00 00 00 00 |.(..............|
    0010 d9 35 97 54 00 00 00 00 |.5.T.... |
    g_detach(0xc1aef2c0)
    g_destroy_consumer(0xc1aef2c0)
    g_destroy_geom(0xc1ad6a80(taste:taste))
    g_taste_taste(TASTE,ad0)
    g_detach(0xc1a8a4c0)
    g_destroy_consumer(0xc1a8a4c0)
    g_destroy_geom(0xc1d33380(taste:taste))
Please note that I have deliberately taken forbidden shortcuts for reading on-disk structures: this code relies heavily on modified LVM2 GPL data structures (which will be rewritten). I preferred focusing on testing rather than on writing robust one-time-use code. This explains why some redundancy can be seen in the data structure dumps (generated using hexdump(9)).
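One way to avoid borrowing structure definitions is to decode fields byte by byte from the raw buffer. The sketch below does that for the 24-byte "Found location" record shown in the log above. The struct taste_location, its field names, and the field layout (two 64-bit little-endian integers followed by a 32-bit one) are my assumptions from looking at the dump, not definitions from LVM2. In the kernel, le64dec() and le32dec() from <sys/endian.h> could replace the two helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a 64-bit little-endian integer from 8 raw bytes. */
uint64_t
get_le64(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return (v);
}

/* Decode a 32-bit little-endian integer from 4 raw bytes. */
uint32_t
get_le32(const unsigned char *p)
{
    return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
        ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Hypothetical in-memory form of the record; layout is an assumption. */
struct taste_location {
    uint64_t offset;    /* bytes 0..7 */
    uint64_t size;      /* bytes 8..15 */
    uint32_t checksum;  /* bytes 16..19 */
};

/* Fill *out from raw bytes; return 0 on success, -1 if too short. */
int
parse_location(const unsigned char *buf, size_t len,
    struct taste_location *out)
{
    if (buf == NULL || out == NULL || len < 20)
        return (-1);
    out->offset = get_le64(buf);
    out->size = get_le64(buf + 8);
    out->checksum = get_le32(buf + 16);
    return (0);
}
```

Decoding this way does not depend on host byte order or on compiler structure padding, which a cast of the buffer to a C struct would.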
Comments
On August 4, 2008, Tony Shadwick wrote:
It's stuff like this that makes me wish that I knew a language better than Perl. I so very much want LVM2 under FreeBSD, but I could never code it myself. :(
ATA over Ethernet pretty much winds up *demanding* LVM - it's not that you can't go without it, but the very purpose of being able to attach additional storage on a layer 2 ethernet interface is so that you can easily scale and grow volumes, and vinum just doesn't cut it. I've tried - believe me. I've wanted many times to like vinum. Instead, I've learned to hate it and avoid it like the plague. LVM2 however has always been a pleasure to work with. My hat is off to you and your work. If there is ever any way I can help just let me know. I want to keep using FreeBSD as my primary server OS, but the lack of support for LVM2 and OpenAFS may wind up driving me to Linux. :(