the making of: LogBooks

JADE LogBooks (scanned versions)

Hamburg, February 15, 2011

"how-to" documentation of digitizing JADE Log Books

S. Bethke and J. Olsson

A. photographing of Log Books

The 21 Log Books of JADE consist of about 100 double-pages each. The size of a double-page is about 70 x 35 cm. The double-pages were photographed using a Nikon D7000 digital SLR (16.2 MPix) with an AF-S Nikkor 10-24 mm 1:3.5-4.5 wide-angle zoom lens. The camera was positioned ca. 40 cm above the table plane, held by a tripod, horizontally fixed to a metal frame. The wide-angle lens had two advantages:
1. larger depth of focus, and
2. shorter distance to the object for the same field of view

No glass plate or other fixtures to keep the pages planar were used, as a glass plate created lots of problems with reflections from the various desktop lamps used to create an acceptable degree of homogeneous illumination. The wide-angle lens with its large depth of focus helped to accommodate the unavoidable bending and warpage of pages without need for further mechanical fixation.
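
The geometry behind advantage 2 can be checked with a few lines of Python. This is only a rough sketch: it assumes a DX sensor of 23.6 x 15.6 mm (not stated above) and a simple pinhole model, and the 50 mm comparison lens is purely hypothetical.

```python
# Rough framing estimate for one double-page (70 x 35 cm) at 40 cm distance.
# Assumptions, not taken from the text: DX sensor of 23.6 x 15.6 mm and a
# pinhole model that ignores lens thickness and distortion.

SENSOR_W_MM = 23.6  # assumed sensor width
SENSOR_H_MM = 15.6  # assumed sensor height


def max_focal_length_mm(distance_mm, object_w_mm, object_h_mm):
    """Longest focal length that still fits the whole object on the sensor."""
    f_width = distance_mm * SENSOR_W_MM / object_w_mm
    f_height = distance_mm * SENSOR_H_MM / object_h_mm
    return min(f_width, f_height)


def required_distance_mm(focal_mm, object_w_mm, object_h_mm):
    """Shortest camera distance at which the whole object fits on the sensor."""
    d_width = focal_mm * object_w_mm / SENSOR_W_MM
    d_height = focal_mm * object_h_mm / SENSOR_H_MM
    return max(d_width, d_height)


f_max = max_focal_length_mm(400, 700, 350)
d_50 = required_distance_mm(50, 700, 350)  # hypothetical 50 mm lens
print(f"focal length at 40 cm: at most {f_max:.1f} mm")
print(f"distance needed with a 50 mm lens: about {d_50 / 10:.0f} cm")
```

Under these assumptions the required focal length lies inside the 10-24 mm zoom range, while a longer lens would need a much taller mounting frame.
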
The photos were taken, alternately, on two 8 GB memory cards (SD HC); a second battery for the camera helped to avoid dead times due to battery charging (no direct powering of camera was possible) and readout of cards (which was done using an external multi-card reader). An infrared activator helped to avoid mechanical agitation. All these details were essential in keeping the overall photo shooting and readout time to a minimum (at the end, abt 1.5 days in total, for 2 people, allowing parallel processing of files).
Raw as well as jpg compressed files were stored for each photo (about 20 MB for the raw and 5 MB for the jpg file each). The files were read out onto four 16 GB memory sticks, 5-6 Log Books on each stick. Copies of these files were also stored onto an external 1 TB hard drive. Readout and further processing of these files was done using an 11" MacBook Air (1.6 GHz Intel Core 2 Duo, 4 GB of RAM, 128 GB SSD).
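
As a plausibility check of the storage numbers above, the following sketch estimates how many Log Books fit on one memory card or stick. It assumes exactly 100 photos per book and 1 GB = 1000 MB; real media hold slightly less than their nominal capacity.

```python
# Storage budget per medium, using the average sizes quoted in the text.
# Assumptions: exactly 100 photos per Log Book, 1 GB = 1000 MB.

PHOTOS_PER_BOOK = 100
RAW_MB = 20  # raw file per photo
JPG_MB = 5   # jpg file per photo


def books_per_medium(capacity_gb, mb_per_gb=1000):
    """Number of complete Log Books (raw + jpg) that fit on one medium."""
    per_book_mb = PHOTOS_PER_BOOK * (RAW_MB + JPG_MB)
    return (capacity_gb * mb_per_gb) // per_book_mb


print("8 GB card: ", books_per_medium(8), "books")
print("16 GB stick:", books_per_medium(16), "books")
```

With these inputs a 16 GB stick holds at most 6 books, in line with the 5-6 books per stick reported above, and an 8 GB card has to be swapped after about 3 books.
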

B. processing of files using Adobe Photoshop (PS) CS4 version 10.0.2:

While the raw data files were left unchanged (for possible future use, e.g. for professional white balance), the jpg files provided by the camera (abt 5 MB each) were further processed for
1. colour adjustment and optimisation
2. rotation by 180 degrees
3. cropping
4. saving to a new location on the MacBook SSD.
This was done in the following steps (using a German version of PS):

- for each Log Book, make duplicates of at least two of the ca. 100 jpg files, and use them to set up the action procedure and to test uniformity w.r.t. constant cropping boundaries (take 1 file at beginning and at end of each Log Book)
- open the two files in PS
- in window "Aktionen", disable previous action lists (uncheck the check mark at the beginning of each item line) (if window "Aktionen" is not displayed, you may open it by clicking "Fenster" -> "Aktionen")
- in window "Aktionen", open a "neue Aktion" (click respective symbol in bottom line) and name it suitably. The "record" sign (red dot, symbol in bottom line) will light up, meaning that all actions on the displayed file will now be recorded.
- with test file open and displayed, execute the activities you wish to automate, i.e., according to the list 1.-4. above:
• "Bild" -> "Auto-Farbton"
• "Bild" -> "Bild-Drehung" -> "180 Grad"
• select "Auswahlbereich" for cropping using Auswahlwerkzeug
• "Bild" -> "Freistellen"
• "Datei" -> "speichern unter" -> select/create target-folder
• "close" current window (important, otherwise windows will pile up in automatic batch mode)
• "stop" recording in action window (button in bottom line, red recording sign turns black) (may have to delete the last action line "schliessen" if it appears twice)
- open 2nd test file, select desired "action" line in action window
- press "run" button in bottom line (-> execute action list)
- reopen last file (it was "close"d in action list), check if o.k. (i.e., if the same cropping area was applied correctly etc.)

If both test files o.k.:
- delete those test files in both source and target folders
- in PS, initiate batch operation: "Datei" -> "Automatisieren" -> "Stapelverarbeitung" (i.e. batch processing)
- in pop-up window, select source and target folders; press "ok" to start batch processing
- may watch single files being opened and processed.

In our case, the average jpg file size for each photo was reduced from 5 MB to abt 3 MB (having defined "level 10" quality when setting up).
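
After a batch run it may be worth checking that every source file has a processed counterpart and how much the files shrank on average. The sketch below is one possible way to do this outside of PS; the function name check_batch is hypothetical, and it assumes that the batch keeps the original file names in the target folder.

```python
from pathlib import Path


def _jpg_sizes(folder, suffix):
    """Map file name -> size in bytes for all files with the given suffix."""
    return {
        p.name: p.stat().st_size
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    }


def check_batch(source_dir, target_dir, suffix=".jpg"):
    """Return (missing file names, total target size / total source size).

    Assumes processed files keep their source file names. The ratio is
    None if no file pair could be compared.
    """
    source = _jpg_sizes(source_dir, suffix)
    target = _jpg_sizes(target_dir, suffix)
    missing = sorted(set(source) - set(target))
    done = set(source) & set(target)
    total_source = sum(source[name] for name in done)
    total_target = sum(target[name] for name in done)
    ratio = total_target / total_source if total_source else None
    return missing, ratio
```

Called as check_batch(source_folder, target_folder), an empty list of missing names and a ratio near 0.6 would correspond to the reduction from 5 MB to abt 3 MB described above.
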

C. making pdf files (one for each Log Book)

For this step we used Adobe Acrobat 9 pro (German version) and its batch (Stapel) mode:
- "Datei" -> "pdf erstellen" -> "Stapelerstellung von mehreren pdf Dateien"
- choose source folder with source jpg files ("Ordner hinzufügen"), click "ok"
- in next pop-up-window ("Ausgabeoptionen"), select "spezifischer Ordner" and choose/create target folder; click "ok"

All this creates pdf files out of the jpg files (with similar size each). In order to concatenate these pdf files to a single one:
- "Datei" -> "zusammenführen" -> "Dateien in einen einzigen pdf ..."
- choose source folder with pdf files ("Ordner hinzufügen")
This results in a single pdf file, in our case abt 300 MB (100 times 3 MB). For an additional version with higher compression:

- "Dokument" -> "Dateigröße verringern" -> "ok"
- choose target location
This resulted in average file sizes of only abt 45 MB, for each file containing the abt 100 double-pages of one Log Book. This is a reduction by a factor of about 45 w.r.t. the raw photo files (100 times 20 MB), with only moderate reduction in visible resolution.

Summary of average file sizes:
• raw file from camera: 20 MB -> abt 40 GB total
• jpg file from camera: 5 MB -> abt 10 GB total
• jpg after cropping etc: 3 MB -> abt 6 GB total
• pdf after conversion: 3 MB -> abt 6 GB total
• pdf for 1 book: 300 MB -> abt 6 GB total
• pdf for 1 book reduced: 45 MB -> abt 1 GB total

meaning that a total of about 70 GB is needed if all files are kept. With the exception of the unconcatenated pdf files, we do indeed keep all files, stored at various places (J.O.'s 1 TB disk, S.B.'s laptops plus various other machines, server at MPP).
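
The totals in this summary can be reproduced from the average sizes with a short calculation. The sketch assumes exactly 21 Log Books with 100 double-pages each and 1 GB = 1000 MB, so the results are rounded estimates like the figures above.

```python
# Reproduce the summary of file sizes from the averages quoted above.
# Assumptions: exactly 21 books with 100 double-pages each, 1 GB = 1000 MB.

BOOKS = 21
PAGES_PER_BOOK = 100

PER_PAGE_MB = {
    "raw file from camera": 20,
    "jpg file from camera": 5,
    "jpg after cropping etc": 3,
    "pdf after conversion": 3,
}
PER_BOOK_MB = {
    "pdf for 1 book": 300,
    "pdf for 1 book reduced": 45,
}


def totals_gb():
    """Total size in GB for each kind of file, summed over all books."""
    totals = {
        name: mb * PAGES_PER_BOOK * BOOKS / 1000
        for name, mb in PER_PAGE_MB.items()
    }
    totals.update({name: mb * BOOKS / 1000 for name, mb in PER_BOOK_MB.items()})
    return totals


totals = totals_gb()
for name, gb in totals.items():
    print(f"{name:25s} {gb:6.1f} GB")
print(f"{'all files':25s} {sum(totals.values()):6.1f} GB")

# Size of one book as raw photos relative to its reduced pdf.
reduction = PER_PAGE_MB["raw file from camera"] * PAGES_PER_BOOK / PER_BOOK_MB["pdf for 1 book reduced"]
print(f"raw photos vs. reduced pdf: factor {reduction:.0f}")
```
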

...and here is a list of contents:


01 10.03.79-09.08.79 0-1337
02 09.08.79-01.03.80 1338-2841
03 01.03.80-01.07.80 2847-3959
04 01.07.80-08.10.80 3960-5201
05 08.10.80-30.03.81 5202-6885
06 30.03.81-29.07.81 6889-8289
07 30.07.81-09.12.81 8290-9648
08 09.12.81-18.05.82 9648-11269
09 18.05.82-13.11.82 11270-12572
10 13.11.82-13.06.83 12593-13706
11 13.06.83-27.10.83 13707-15050
12 27.10.83-09.03.84 15051-16287
13 09.03.84-30.07.84 16288-17542
14 30.07.84-30.10.84 17543-18897
15 30.10.84-31.05.85 18898-21041
16 31.05.85-31.08.85 21042-22501
17 31.08.85-22.02.86 22502-24281
18 22.02.86-09.05.86 24282-25904
19 09.05.86-23.07.86 25905-27657
20 25.07.86-06.10.86 27658-29616
21 07.10.86-03.11.86 29617-30403
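
A list like this is easy to check by machine. The following sketch parses entries of the form "book start-end first-last" and flags spots that may deserve a second look. The names parse_entry and find_problems are hypothetical, and the meaning of the third column is not stated here; the sketch only assumes that it is a consecutive number range. A flagged line is a hint for a human reader, not necessarily an error.

```python
from datetime import datetime


def parse_entry(line):
    """Parse one line such as '01 10.03.79-09.08.79 0-1337'."""
    book, dates, numbers = line.split()
    start, end = (datetime.strptime(d, "%d.%m.%y").date() for d in dates.split("-"))
    first, last = (int(n) for n in numbers.split("-"))
    return {"book": int(book), "start": start, "end": end, "first": first, "last": last}


def find_problems(entries):
    """Return human-readable remarks about suspicious entries."""
    problems = []
    for e in entries:
        if e["end"] < e["start"]:
            problems.append(f"book {e['book']:02d}: end date before start date")
        if e["last"] < e["first"]:
            problems.append(f"book {e['book']:02d}: number range runs backwards")
    # Compare each book with its successor (assumes consecutive numbering).
    for prev, cur in zip(entries, entries[1:]):
        if cur["first"] != prev["last"] + 1:
            problems.append(
                f"books {prev['book']:02d}/{cur['book']:02d}: "
                f"range jumps from {prev['last']} to {cur['first']}"
            )
    return problems


sample = [
    "01 10.03.79-09.08.79 0-1337",
    "02 09.08.79-01.03.80 1338-2841",
    "03 01.03.80-01.07.80 2847-3959",
]
entries = [parse_entry(line) for line in sample]
for remark in find_problems(entries):
    print(remark)
```

For the three sample lines, taken from the list above, the only remark is the jump from 2841 to 2847 between books 02 and 03.
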

