I was chuffed to see Kevin Ford report that the Unix utility file
now recognizes MARC records:
$ file 101015_001.mp3
101015_001.mp3: Audio file with ID3 version 2.3.0, contains: MPEG ADTS, layer III, v1, 192 kbps, 44.1 kHz, Stereo
$ file my-cats.jpg
my-cats.jpg: JPEG image data, JFIF standard 1.02
$ file OL.20100104.01.mrc
OL.20100104.01: MARC21 Bibliographic
If you download the source and look at the magic/Magdir/marc21
file you’ll see what makes it work. Every file type has some “magic” that lets you identify it:
#--------------------------------------------
# marc21: file(1) magic for MARC 21 Format
#
# Kevin Ford (kefo@loc.gov)
#
# MARC21 formats are for the representation and communication
# of bibliographic and related information in machine-readable
# form. For more info, see http://www.loc.gov/marc/

# leader position 20-21 must be 45
20 string 45

# leader starts with 5 digits, followed by codes specific to MARC format
>0 regex/1 (^[0-9]{5})[acdnp][^bhlnqsu-z] MARC21 Bibliographic
!:mime application/marc
>0 regex/1 (^[0-9]{5})[acdnosx][z] MARC21 Authority
!:mime application/marc
>0 regex/1 (^[0-9]{5})[cdn][uvxy] MARC21 Holdings
!:mime application/marc
0 regex/1 (^[0-9]{5})[acdn][w] MARC21 Classification
!:mime application/marc
>0 regex/1 (^[0-9]{5})[cdn][q] MARC21 Community
!:mime application/marc

# leader position 22-23, should be "00" but is it?
>0 regex/1 (^.{21})([^0]{2}) (non-conforming)
!:mime application/marc
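To make the rules a bit more concrete, here is a rough Python sketch of the same checks applied to the 24-byte MARC leader. It is only my reading of the magic entries, not how file(1) is actually implemented: the names identify_marc, RULES and NON_CONFORMING are made up for illustration, and the sketch simply returns the first label that matches.

```python
import re

# (label, pattern) pairs mirroring the regexes in the marc21 magic entries.
# Each pattern reads: 5 digits (record length), record status, type of record.
RULES = [
    ("MARC21 Bibliographic", rb"^[0-9]{5}[acdnp][^bhlnqsu-z]"),
    ("MARC21 Authority", rb"^[0-9]{5}[acdnosx]z"),
    ("MARC21 Holdings", rb"^[0-9]{5}[cdn][uvxy]"),
    ("MARC21 Classification", rb"^[0-9]{5}[acdn]w"),
    ("MARC21 Community", rb"^[0-9]{5}[cdn]q"),
]

# Same expression as the "(non-conforming)" entry, copied as written.
NON_CONFORMING = rb"^.{21}[^0]{2}"


def identify_marc(data: bytes):
    """Return a label such as 'MARC21 Bibliographic', or None if no rule matches."""
    leader = data[:24]
    # First test in the magic file: the bytes at offsets 20-21 must be "45".
    if leader[20:22] != b"45":
        return None
    for label, pattern in RULES:
        if re.match(pattern, leader):
            # Flag leaders that trip the non-conforming expression.
            if re.match(NON_CONFORMING, leader, re.DOTALL):
                label += " (non-conforming)"
            return label
    return None


if __name__ == "__main__":
    # A made-up bibliographic leader, used only to exercise the function.
    print(identify_marc(b"00714cam a2200205 a 4500"))
```

Calling identify_marc on the first bytes of a file (for example open(path, "rb").read(24)) gives a label comparable to the ones file prints, or None for anything that does not look like a MARC leader.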
It's a small victory that a basic Unix/Linux utility can now recognize a key library file format, but as Kyle Banerjee put it on the Code4Lib mailing list, “I’m not sure whether to laugh or cry that it’s a sign of progress that a 40 year old utility designed to identify file types is now just beginning to be able to recognize a format that’s been around for almost 50 years.”