UnROOT.jl is a reader for the CERN ROOT file format written entirely in Julia, without any dependence on ROOT or Python.
Installation Guide
- Download the latest Julia release
- Open the Julia REPL (press ] once to enter Pkg mode, press backspace to exit it)
julia> ]
(v1.8) pkg> add UnROOT
Quick Start (see docs for more)
julia> using UnROOT
julia> f = ROOTFile("test/samples/NanoAODv5_sample.root")
ROOTFile with 2 entries and 21 streamers.
test/samples/NanoAODv5_sample.root
└─ Events
   ├─ "run"
   ├─ "luminosityBlock"
   ├─ "event"
   ├─ "HTXS_Higgs_pt"
   ├─ "HTXS_Higgs_y"
   └─ "⋮"
julia> mytree = LazyTree(f, "Events", ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"])
 Row │ Electron_dxy     nMuon   Muon_eta         Muon_pt
     │ Vector{Float32}  UInt32  Vector{Float32}  Vector{Float32}
─────┼───────────────────────────────────────────────────────────
 1   │ [0.000371]       0       []               []
 2   │ [-0.00982]       2       [0.53, 0.229]    [19.9, 15.3]
 3   │ []               0       []               []
 4   │ [-0.00157]       0       []               []
 ⋮   │ ⋮                ⋮       ⋮                ⋮
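The branch selection passed to LazyTree mixes exact names with a regular expression. As a rough illustration of how such a mixed selection could pick out branch names, here is a minimal sketch in plain Julia. It does not use UnROOT and is not claimed to be how UnROOT matches branches internally; `matches_selector`, `select_branches`, and the sample list `branch_names` (including `Muon_phi` and `Muon_pt_err`) are hypothetical and exist only for this example.

```julia
# Hypothetical helpers (not part of UnROOT): filter branch names using
# a mix of exact strings and regular expressions.

# An exact string selects only the identical name.
matches_selector(sel::AbstractString, name::AbstractString) = sel == name
# A regex selects every name in which the pattern occurs.
matches_selector(sel::Regex, name::AbstractString) = occursin(sel, name)

# Keep the names matched by at least one selector, preserving input order.
function select_branches(names, selectors)
    return [n for n in names if any(s -> matches_selector(s, n), selectors)]
end

# Illustrative branch names (assumed, not read from a real file).
branch_names = ["run", "Electron_dxy", "nMuon", "Muon_pt", "Muon_eta",
                "Muon_phi", "Muon_pt_err"]
selectors = ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"]

selected = select_branches(branch_names, selectors)
println(selected)
```

The trailing `$` in the pattern is what keeps a name such as `Muon_pt_err` out of the selection, since the pattern must end exactly at `pt` or `eta`.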
You can iterate through a LazyTree:
julia> for event in mytree
           @show event.Electron_dxy
           break
       end
event.Electron_dxy = Float32[0.00037050247]
julia> Threads.@threads for event in mytree # multi-threading
           ...
       end
Only one basket per branch is cached at a time, so you don't have to worry about running out of RAM.
At the same time, event inside the for-loop is not materialized until a field is accessed. If your event is fairly small or you need all of its fields anyway, you can call collect(event) first inside the loop.
XRootD is also supported, depending on the protocol:
- the "url" has to start with http:// or https://
- (1.6+ only) or the "url" has to start with root:// and have another // to separate server and file path
julia> r = @time ROOTFile("https://scikit-hep.org/uproot3/examples/Zmumu.root")
0.034877 seconds (5.13 k allocations: 533.125 KiB)
ROOTFile with 1 entry and 18 streamers.
julia> r = ROOTFile("root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root")
ROOTFile with 1 entry and 19 streamers.
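The URL rules above amount to a few string checks. The following is a minimal sketch in plain Julia of how the three cases (HTTP(S), root://, local path) could be told apart, and how a root:// URL splits at its second //. It is not UnROOT's actual implementation; `classify_source` and `split_xrootd_url` are hypothetical names introduced only for illustration.

```julia
# Hypothetical helpers (not part of UnROOT) illustrating the URL rules.

# Classify a path or URL by its protocol prefix.
function classify_source(path::AbstractString)
    if startswith(path, "http://") || startswith(path, "https://")
        return :http
    elseif startswith(path, "root://")
        return :xrootd
    else
        return :local
    end
end

# Split a root:// URL into (server, file path) at the second "//".
function split_xrootd_url(url::AbstractString)
    prefix = "root://"
    startswith(url, prefix) || throw(ArgumentError("not a root:// URL: $url"))
    rest = url[ncodeunits(prefix)+1:end]        # "server//path/to/file.root"
    sep = findfirst("//", rest)
    sep === nothing &&
        throw(ArgumentError("missing // between server and file path: $url"))
    server = rest[1:prevind(rest, first(sep))]  # everything before the "//"
    filepath = rest[last(sep):end]              # keep one leading "/"
    return server, filepath
end

url = "root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root"
server, filepath = split_xrootd_url(url)
println(classify_source(url), " ", server, " ", filepath)
```

A URL with only a single slash after the server name raises an ArgumentError in this sketch, which mirrors the requirement that a second // separates server and file path.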
Branch of custom struct
We provide an experimental interface for hooking up UnROOT with your custom types that only takes 2 steps, as explained in the docs.
As a showcase for this functionality, the TLorentzVector support in UnROOT is implemented with this plug-in system.
Support & Contributing
- Use GitHub issues for bug reports and feature requests; feel free to open PRs for bug fixes, feature tuning, quality-of-life improvements, docs, examples, etc.
- See CONTRIBUTING.md for more information and recommended workflows for contributing to this package.
TODOs
- Parsing the file header
- Read the TKeys of the top level dictionary
- Reading the available trees
- Reading the available streamers
- Reading a simple dataset with primitive streamers
- Reading of raw basket bytes for debugging
- Automatically generate streamer logic
- Prettier show for Lazy*s
- Clean up Cursor use
- Reading TNtuple #27
- Reading histograms (TH1D, TH1F, TH2D, TH2F, etc.) #48
- Clean up the readtype, unpack, stream! and readobjany construct
- Refactor the code and add more docs
- Class name detection of sub-branches
- High-level histogram interface
Acknowledgements
Special thanks to Jim Pivarski (@jpivarski) from the Scikit-HEP project, who is the main author of uproot, a native Python library to read and write ROOT files, which was and is a great source of inspiration and information for reverse engineering the ROOT binary structures.
Behind the scenes
Some additional debug output:
julia> using UnROOT
julia> f = ROOTFile("test/samples/tree_with_histos.root")
Compressed stream at 1509
ROOTFile("test/samples/tree_with_histos.root") with 1 entry and 4 streamers.
julia> keys(f)
1-element Array{String,1}:
"t1"
julia> keys(f["t1"])
Compressed datastream of 1317 bytes at 1509 (TKey 't1' (TTree))
2-element Array{String,1}:
"mynum"
"myval"
julia> f["t1"]["mynum"]
Compressed datastream of 1317 bytes at 6180 (TKey 't1' (TTree))
UnROOT.TBranch
  cursor: UnROOT.Cursor
  fName: String "mynum"
  fTitle: String "mynum/I"
  fFillColor: Int16 0
  fFillStyle: Int16 1001
  fCompress: Int32 101
  fBasketSize: Int32 32000
  fEntryOffsetLen: Int32 0
  fWriteBasket: Int32 1
  fEntryNumber: Int64 25
  fIOFeatures: UnROOT.ROOT_3a3a_TIOFeatures
  fOffset: Int32 0
  fMaxBaskets: UInt32 0x0000000a
  fSplitLevel: Int32 0
  fEntries: Int64 25
  fFirstEntry: Int64 0
  fTotBytes: Int64 170
  fZipBytes: Int64 116
  fBranches: UnROOT.TObjArray
  fLeaves: UnROOT.TObjArray
  fBaskets: UnROOT.TObjArray
  fBasketBytes: Array{Int32}((10,)) Int32[116, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  fBasketEntry: Array{Int64}((10,)) [0, 25, 0, 0, 0, 0, 0, 0, 0, 0]
  fBasketSeek: Array{Int64}((10,)) [238, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  fFileName: String ""
julia> seek(f.fobj, 238)
IOStream(<file test/samples/tree_with_histos.root>)
julia> basketkey = UnROOT.unpack(f.fobj, UnROOT.TKey)
UnROOT.TKey64(116, 1004, 100, 0x6526eafb, 70, 0, 238, 100, "TBasket", "mynum", "t1")
julia> s = UnROOT.datastream(f.fobj, basketkey)
Compressed datastream of 100 bytes at 289 (TKey 'mynum' (TBasket))
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=100, maxsize=Inf, ptr=1, mark=-1)
julia> [UnROOT.readtype(s, Int32) for _ in 1:f["t1"]["mynum"].fEntries]
Compressed datastream of 1317 bytes at 6180 (TKey 't1' (TTree))
25-element Array{Int32,1}:
0
1
2
3
4
5
6
7
8
9
10
10
10
10
10
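The last step of the session above decodes consecutive Int32 values from a decompressed basket. ROOT stores numbers in big-endian byte order, so each value has to be byte-swapped on little-endian machines. The sketch below shows the idea on an in-memory buffer using only Julia's standard `read`, `hton` and `ntoh`. It is an illustration and not UnROOT's `readtype` itself; `read_be` is a hypothetical helper and the three sample values are made up.

```julia
# Hypothetical helper (not part of UnROOT): read one big-endian value of
# type T from an IO stream and convert it to the host byte order.
read_be(io::IO, ::Type{T}) where {T} = ntoh(read(io, T))

# Build a small buffer holding three big-endian Int32 values (sample data).
buf = IOBuffer()
for v in Int32[0, 1, 10]
    write(buf, hton(v))   # hton converts host order to big-endian
end
seekstart(buf)

# Decode them one after another, as in the comprehension shown above.
decoded = [read_be(buf, Int32) for _ in 1:3]
println(decoded)
```

Because `hton` and `ntoh` are no-ops on big-endian hosts, the same code decodes correctly regardless of the machine's native byte order.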
Contributors ✨
Thanks go to these wonderful people (emoji key):
- Tamas Gal
- Jerry Ling
- Johannes Schumann
- Nick Amin
- Mosè Giordano
- Oliver Schulz
- Misha Mikhasenko
This project follows the all-contributors specification. Contributions of any kind welcome!