August 28, 2007

Learning GEOM: Tasting

Category: FreeBSD.

I have been quite busy these last few days and didn't get as much time as expected to have fun with geom(4), the FreeBSD modular disk transformation framework. Since documentation is quite sparse compared to the rocket science inside GEOM, I spend a lot of time in man pages and reading the source code of existing GEOM classes... However, as recently said on the freebsd-geom mailing list:

Ahh you're believing the code comment :( (Rule #1: Never believe the code comments).

Well, this doesn't make things easy, but I can cope with that (or at least I hope so). I have already located interesting classes to study for the purpose of geom_lvm2:

geom/geom_bsd.c
This class looks relatively simple and is well commented. It seems to be a good starting point for learning how GEOM works (687 SLOC);
geom/concat/g_concat.c
This class seems a little more complex. Concatenation will have to be handled when a logical volume has physical extents on different physical volumes. It may be possible to rely directly on this class for this purpose (876 SLOC);
geom/vinum/*
Gvinum is a really complex^Wcool piece of code. Since it internally relies on various representations of disk chunks to provide services such as RAID, it can be a good source of inspiration for geom_lvm2's internals (7637 SLOC).

Tasting experiments

When not reading existing code, I try to implement basic GEOM classes that provide simple facilities. For the moment, I am focusing on tasting. Basically, tasting occurs when a new GEOM class or provider is created (e.g. when a GEOM kernel module is loaded or when a new device is connected to the system). The GEOM class then examines the provider to see whether it can handle it (generally by looking for magic data).

Declaring a GEOM class with tasting capabilities

Declaring a GEOM class that can perform tasting operations is as simple as setting the .taste member of the data structure passed to the DECLARE_GEOM_CLASS() macro:

#define TASTE_CLASS_NAME "TASTE"

static struct g_class g_taste_class =
{
  .name         = TASTE_CLASS_NAME,
  .version      = G_VERSION,
  .taste        = g_taste_taste,
  .destroy_geom = g_taste_destroy_geom,
};

DECLARE_GEOM_CLASS(g_taste_class, g_taste);

The g_taste_taste() function has to be declared as:

static struct g_geom *
g_taste_taste(struct g_class *mp, struct g_provider *pp,
    int flags __unused)

Reading tasted device

In order to read data from the device being tasted, we first have to attach a consumer to it.

This is a three-step operation: create an instance of the geom class ①, add a consumer to it ②, and attach that consumer to the device provider ③:

static struct g_geom *
g_taste_taste(struct g_class *mp, struct g_provider *pp,
    int flags __unused)
{
  struct g_consumer *cp;
  struct g_geom *gp;

  g_trace(G_T_TOPOLOGY, "%s(%s,%s)", __func__, mp->name, pp->name);
  g_topology_assert();

  gp = g_new_geomf(mp, "taste:taste"); ①
  cp = g_new_consumer(gp); ②
  g_attach(cp, pp); ③

[...]

It is then possible to read data from the disk. Once again, this is a bit trickier than in userland: we first have to gain read access ①, release the topology lock ②, read what we want ③, re-acquire the topology lock ④, and drop read access ⑤:

[...]

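  int error;
  u_char *buf;

  error = g_access(cp, 1, 0, 0); ①
  if (error) {
    /* Undo the setup above before giving up. */
    g_detach(cp);
    g_destroy_consumer(cp);
    g_destroy_geom(gp);
    return (NULL);
  }
  g_topology_unlock(); ②
  buf = g_read_data(cp, 0x200, pp->sectorsize, &error); ③
  g_topology_lock(); ④
  g_access(cp, -1, 0, 0); ⑤
  if (buf == NULL) {
    /* Read failed: same cleanup, nothing to taste. */
    g_detach(cp);
    g_destroy_consumer(cp);
    g_destroy_geom(gp);
    return (NULL);
  }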

[...]

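When done, we have to release all allocated resources (the buffer returned by g_read_data() also has to be released with g_free() once it is no longer needed):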

[...]

  g_detach(cp);
  g_destroy_consumer(cp);
  g_destroy_geom(gp);

  return(NULL);
}

Not so obvious, is it?

Real life example

I have started to write a taste GEOM class that looks for LVM2 physical volumes. Here is what it appends to dmesg(8) when I kldload(8) it (with the sysctl kern.geom.debugflags=1):

g_modevent(TASTE, LOAD)
g_post_event_x(0xc0640bac, 0xc1a3fa00, 2, 262144)
g_load_class(TASTE)
g_taste_taste(TASTE,ad0s1)
g_detach(0xc1a8a3c0)
g_destroy_consumer(0xc1a8a3c0)
g_destroy_geom(0xc1afc900(taste:taste))
g_taste_taste(TASTE,ad0s1f)
g_detach(0xc1aee480)
g_destroy_consumer(0xc1aee480)
g_destroy_geom(0xc1d33700(taste:taste))
g_taste_taste(TASTE,ad0s1e)
g_detach(0xc1aee580)
g_destroy_consumer(0xc1aee580)
g_destroy_geom(0xc1d33480(taste:taste))
g_taste_taste(TASTE,ad0s1d)
g_detach(0xc1aee600)
g_destroy_consumer(0xc1aee600)
g_destroy_geom(0xc1aa8080(taste:taste))
g_taste_taste(TASTE,ad0s1c)
g_detach(0xc1aeec00)
g_destroy_consumer(0xc1aeec00)
g_destroy_geom(0xc1d33100(taste:taste))
g_taste_taste(TASTE,ad0s1b)
g_detach(0xc1a8a340)
g_destroy_consumer(0xc1a8a340)
g_destroy_geom(0xc1d32e00(taste:taste))
g_taste_taste(TASTE,ad0s1a)
g_detach(0xc1aef300)
g_destroy_consumer(0xc1aef300)
g_destroy_geom(0xc1d31200(taste:taste))
g_taste_taste(TASTE,ad2)
===> Found LVM2 physical device ad2
0000   4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00  |LABELONE........|
0010   f6 30 e2 a1 20 00 00 00 4c 56 4d 32 20 30 30 31  |.0.. ...LVM2 001|
---> Physical Volume ID: AsW40RRq074Ac0P03XLZp1IJfrg1DPbq
0000   41 73 57 34 30 52 52 71 30 37 34 41 63 30 50 30  |AsW40RRq074Ac0P0|
0010   33 58 4c 5a 70 31 49 4a 66 72 67 31 44 50 62 71  |3XLZp1IJfrg1DPbq|
0020   00 40 32 0f 00 00 00 00 00 00 03 00 00 00 00 00  |.@2.............|
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0040   00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00  |................|
0050   00 f0 02 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0070   00 00 00 00 00 00 00 00                          |........        |
---> Dumping memory at 0x1000
0000   7c 99 ff fc 20 4c 56 4d 32 20 78 5b 35 41 25 72  ||... LVM2 x[5A%r|
0010   30 4e 2a 3e 01 00 00 00 00 10 00 00 00 00 00 00  |0N*>............|
0020   00 f0 02 00 00 00 00 00 00 08 00 00 00 00 00 00  |................|
0030   17 07 00 00 00 00 00 00 d9 35 97 54 00 00 00 00  |.........5.T....|
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0090   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
---> Found location:
0000   00 08 00 00 00 00 00 00 17 07 00 00 00 00 00 00  |................|
0010   d9 35 97 54 00 00 00 00                          |.5.T....        |
g_detach(0xc1a8a240)
g_destroy_consumer(0xc1a8a240)
g_destroy_geom(0xc1ad7200(taste:taste))
g_taste_taste(TASTE,ad1)
===> Found LVM2 physical device ad1
0000   4c 41 42 45 4c 4f 4e 45 01 00 00 00 00 00 00 00  |LABELONE........|
0010   1c 08 45 cb 20 00 00 00 4c 56 4d 32 20 30 30 31  |..E. ...LVM2 001|
---> Physical Volume ID: sGOnNc3VOhRU7iFVkc335jcyrPLVf3Pl
0000   73 47 4f 6e 4e 63 33 56 4f 68 52 55 37 69 46 56  |sGOnNc3VOhRU7iFV|
0010   6b 63 33 33 35 6a 63 79 72 50 4c 56 66 33 50 6c  |kc335jcyrPLVf3Pl|
0020   00 44 2d 10 00 00 00 00 00 00 03 00 00 00 00 00  |.D-.............|
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0040   00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00  |................|
0050   00 f0 02 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0070   00 00 00 00 00 00 00 00                          |........        |
---> Dumping memory at 0x1000
0000   06 83 18 71 20 4c 56 4d 32 20 78 5b 35 41 25 72  |...q LVM2 x[5A%r|
0010   30 4e 2a 3e 01 00 00 00 00 10 00 00 00 00 00 00  |0N*>............|
0020   00 f0 02 00 00 00 00 00 00 28 00 00 00 00 00 00  |.........(......|
0030   17 07 00 00 00 00 00 00 d9 35 97 54 00 00 00 00  |.........5.T....|
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0090   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
---> Found location:
0000   00 28 00 00 00 00 00 00 17 07 00 00 00 00 00 00  |.(..............|
0010   d9 35 97 54 00 00 00 00                          |.5.T....        |
g_detach(0xc1aef2c0)
g_destroy_consumer(0xc1aef2c0)
g_destroy_geom(0xc1ad6a80(taste:taste))
g_taste_taste(TASTE,ad0)
g_detach(0xc1a8a4c0)
g_destroy_consumer(0xc1a8a4c0)
g_destroy_geom(0xc1d33380(taste:taste))

Please note that I have deliberately taken forbidden shortcuts for reading on-disk structures: this relies heavily on modified LVM2 GPL data structures (which will be rewritten). I preferred focusing on testing rather than on writing robust one-time-use code. This explains why some redundancy can be seen in the data structure dumps (generated using hexdump(9)).


Comments

On August 4, 2008, Tony Shadwick wrote:

It's stuff like this that makes me wish that I knew a language better than Perl. I so very much want LVM2 under FreeBSD, but I could never code it myself. :(

ATA over Ethernet pretty much winds up *demanding* LVM - it's not that you can't go without it, but the very purpose of being able to attach additional storage on a layer 2 ethernet interface is so that you can easily scale and grow volumes, and vinum just doesn't cut it. I've tried - believe me. I've wanted many times to like vinum. Instead, I've learned to hate it and avoid it like the plague. LVM2 however has always been a pleasure to work with. My hat is off to you and your work. If there is ever any way I can help just let me know. I want to keep using FreeBSD as my primary server OS, but the lack of support for LVM2 and OpenAFS may wind up driving me to Linux. :(
