

The goal was to simplify access to users' data taken during beamtimes at DESY's FLASH accelerator facility, using Python as the programming language of choice.

The examples illustrate how to access and evaluate the most common experimental parameters (e.g. GMDs, images, ADCs). An authentic HDF5 file with 1200 data points (2 minutes at FLASH's characteristic 10 Hz)
is provided as an example, to help you get familiar with the file structure and to encounter the typical difficulties of evaluating large datasets.

Getting Started

Clone or download flashh5 (and the desired examples) into your working folder, or set PYTHONPATH accordingly.

$ python3

or work interactively via IPython

$ ipython3
In [1]: import flashh5

Pressing the TAB key after "flashh5.FlashH5File." lists flashh5's functions

In [2]: flashh5.FlashH5File.<TAB>

and appending "?" to a function's name shows information
describing the function's purpose and arguments.

In [3]: flashh5.FlashH5File.get_channel_data?
Type:        instancemethod
String form: <unbound method FlashH5File.get_channel_data>
File:        /home/cpassow/PycharmProjects/flashh5/
Definition:  flashh5.FlashH5File.get_channel_data(self, channel_name)
Docstring:   Returns complete data from channel as numpy.ndarray

Simplified use as pythonic context manager via:

flash_h5 = flashh5.FlashH5File(h5_filename)
with flash_h5 as obj:
    ### Lists all main keys

List of methods


Lists all main keys e.g.

Keys: ['FL1', 'FL2', 'Timing']


Lists full tree of the h5 file e.g.

Timing/gmd/pulse ID
Timing/pulse ID
Timing/pulse time
Timing/repetition rate
Timing/set number of bunches
Timing/time stamp
Timing/time stamp/fl1user1
Timing/time stamp/gmd

get_number_of_events('FL1/Experiment/Camera/Focus microscope/image')

Prints the number of events in the given channel; if no channel is specified, the train ID channel is used, e.g.

Number of events in Timing/train ID : 1200
Number of events in FL1/Photon Diagnostic/GMD/Average energy/energy BDA (raw) : 1200


Prints and returns the first trainID and the last trainID + 1 of the current file, e.g.

Current file: first trainID = 1485596024 last trainID + 1 = 1485597224 #trains = 1200

get_time_interval(FIRST_TRAINID, LAST_TRAINID + 1)

Prints the start and end time in UTC for the given first and last trainID, e.g.

Current file - UTC start time :
2017-09-16 09:34:50
Current file - UTC end time :
2017-09-16 09:36:49
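The relation between the number of trains and the printed interval can be checked with a few lines of standard-library Python. This is only an arithmetic sketch, not how flashh5 derives the times: it assumes one train every 0.1 s (10 Hz) and takes the start time from the example output above. `utc_interval` and `TRAIN_PERIOD_S` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

TRAIN_PERIOD_S = 0.1  # assumption: one pulse train every 0.1 s (10 Hz)


def utc_interval(start_utc, n_trains):
    """Return (start, end) UTC datetimes for n_trains consecutive trains."""
    # The last train starts (n_trains - 1) periods after the first one.
    end_utc = start_utc + timedelta(seconds=(n_trains - 1) * TRAIN_PERIOD_S)
    return start_utc, end_utc


start, end = utc_interval(
    datetime(2017, 9, 16, 9, 34, 50, tzinfo=timezone.utc), 1200
)
# strftime drops the fractional seconds of the end time.
start_text = start.strftime("%Y-%m-%d %H:%M:%S")
end_text = end.strftime("%Y-%m-%d %H:%M:%S")
print("UTC start time:", start_text)
print("UTC end time:  ", end_text)
```

Under this assumption, 1200 trains span 119.9 s, which agrees with the two timestamps printed for the example file.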

get_train_ID_range(FIRST_TRAINID, LAST_TRAINID + 1)

Returns a numpy.ndarray with the trainIDs and a numpy.ndarray with the excludes between the given first and last trainID
(scans for zeros and NaNs in the trainIDs of the HDF5 file).
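One possible reading of this behaviour, sketched with plain NumPy on synthetic data: build the requested trainID range, then report as excludes every requested ID that has no valid (non-zero, non-NaN) entry in the file. The function `train_id_range` and its arguments are hypothetical and not part of flashh5; the actual implementation may differ.

```python
import numpy as np


def train_id_range(stored_ids, first_id, last_id_plus_one):
    """Return (train_ids, excludes) for the half-open range [first, last + 1)."""
    stored = np.asarray(stored_ids, dtype=float)
    # Zeros and NaNs are treated as trains without a valid ID in the file.
    valid = stored[np.isfinite(stored) & (stored != 0)].astype(np.int64)
    train_ids = np.arange(first_id, last_id_plus_one, dtype=np.int64)
    # Requested IDs without a valid entry in the file become the excludes.
    excludes = train_ids[~np.isin(train_ids, valid)]
    return train_ids, excludes


# Synthetic trainID channel with one zero and one NaN entry.
stored_ids = [1485596024, 1485596025, 0, 1485596027, np.nan, 1485596029]
train_ids, excludes = train_id_range(stored_ids, 1485596024, 1485596030)
print(train_ids)
print(excludes)
```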


get_channel_data(CHANNEL_NAME)

Returns complete data from channel as numpy.ndarray

get_channel_data_by_train_ID(CHANNEL_NAME, TRAINID_ARRAY, EXCLUDE_ARRAY = 'optional')

Returns a numpy.ndarray with the channel data indexed by an array of trainIDs.
Optional parameter: passing an exclude array (e.g. to exclude zeros and/or NaNs) speeds things up when the trainID array is a continuous interval with just a few gaps.
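The idea of indexing channel data by trainID can be illustrated with NumPy alone. The sketch below is an assumption about the general technique, not the flashh5 implementation: `select_by_train_id` is a hypothetical helper, and the arrays are synthetic stand-ins for a trainID channel and a data channel of equal length.

```python
import numpy as np


def select_by_train_id(file_ids, channel_data, wanted_ids, exclude_ids=None):
    """Return the rows of channel_data whose trainID is in wanted_ids."""
    file_ids = np.asarray(file_ids)
    wanted = np.asarray(wanted_ids)
    if exclude_ids is not None:
        # Drop IDs known to be missing before searching the file.
        wanted = wanted[~np.isin(wanted, exclude_ids)]
    # Boolean mask over the file's trainIDs; rows keep the file order.
    mask = np.isin(file_ids, wanted)
    return np.asarray(channel_data)[mask]


file_ids = np.array([100, 101, 103, 104])      # train 102 is missing
energy = np.array([1.0, 1.1, 1.3, 1.4])        # one value per stored train
wanted = np.arange(101, 105)                   # continuous interval 101..104
picked = select_by_train_id(file_ids, energy, wanted, exclude_ids=[102])
print(picked)
```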

List of examples

Example shows the standard outputs for an HDF5 file via flashh5 (functions listed above).

Example plots UNIX timestamp vs trainID.

Example plots the correlation between GMD-BDA and GMD-tunnel. The channel data are not indexed by train ID: the script fetches all data from the channel and slices it afterwards. This is not recommended for images or ADC traces.
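The fetch-everything-then-slice pattern is cheap for scalar channels such as the GMD values. A minimal sketch with synthetic data follows; the two arrays stand in for the complete channel data (e.g. as returned by get_channel_data), and the numbers, the slice bounds and the use of np.corrcoef are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic stand-ins for two complete scalar channels (arbitrary units).
gmd_tunnel = rng.normal(50.0, 10.0, size=1200)
gmd_bda = 0.8 * gmd_tunnel + rng.normal(0.0, 1.0, size=1200)

# Slice afterwards: keep only the shots of interest by array position.
window = slice(200, 1000)
r = np.corrcoef(gmd_tunnel[window], gmd_bda[window])[0, 1]
print("Pearson correlation in window:", r)
```

For images or ADC traces the same pattern would first load every frame or trace of the file into memory, which is why it is discouraged there.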

Several examples show an averaged camera image (all data of the file, indexed via an array of trainIDs, or via slicing the array of trainIDs).
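When averaging many images, it can help to accumulate the sum chunk by chunk instead of holding all frames in memory at once. The sketch below uses a synthetic NumPy array; `average_in_chunks` and the chunk size are hypothetical. The same slicing pattern would also apply to an h5py dataset, which reads only the requested slice from disk.

```python
import numpy as np


def average_in_chunks(images, chunk_size=100):
    """Average along axis 0 while holding only chunk_size frames at a time."""
    total = np.zeros(images.shape[1:], dtype=np.float64)
    for start in range(0, images.shape[0], chunk_size):
        # Sum each chunk in float64 to avoid integer overflow.
        chunk = images[start:start + chunk_size]
        total += chunk.sum(axis=0, dtype=np.float64)
    return total / images.shape[0]


rng = np.random.default_rng(seed=2)
# Synthetic 12-bit camera frames: 250 images of 8 x 8 pixels.
frames = rng.integers(0, 4096, size=(250, 8, 8), dtype=np.uint16)
avg = average_in_chunks(frames, chunk_size=100)
print(avg.shape)
```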

Example shows an averaged ADC trace (1 sigma quantile via Gaussian fit) after sorting by GMD values.
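A simplified sketch of this kind of selection, on synthetic data: instead of an explicit Gaussian fit and sort, it uses the sample mean and standard deviation of the GMD values as the estimate of the Gaussian parameters, and a boolean mask to keep the shots within one sigma. All names, shapes and numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_shots, n_samples = 1200, 64

# Synthetic GMD values (arbitrary units), one per shot.
gmd = rng.normal(50.0, 10.0, size=n_shots)
# Synthetic ADC traces: a fixed pulse shape scaled by the shot's GMD value.
pulse_shape = np.exp(-0.5 * ((np.arange(n_samples) - 32) / 4.0) ** 2)
adc = gmd[:, None] * pulse_shape[None, :]

# Sample mean/std stand in for the parameters of a Gaussian fit.
mu, sigma = gmd.mean(), gmd.std()
in_band = np.abs(gmd - mu) <= sigma      # shots within mu +/- 1 sigma
mean_trace = adc[in_band].mean(axis=0)   # average only the selected traces
print("shots used:", int(in_band.sum()), "of", n_shots)
```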



The intention was to make access to HDF5 files with the FLASH-specific structure easier, and therefore faster, for us and the users. Comments and suggestions are welcome.
Feel free to add content and examples.

November 2018 - Update: 1.) changed to Erland's new HDF tree 2.) changed methods to work with the context manager
