All Projects → pythonicrubyist → Creek

pythonicrubyist / Creek

Licence: mit
Ruby library for parsing large Excel files.

Programming Languages

ruby
36898 projects - #4 most used programming language

Projects that are alternatives of or similar to Creek

excel-merge
A PHP library to merge two or more Excel files into one
Stars: ✭ 26 (-90.37%)
Mutual labels:  excel, xlsx
exoffice
Library to parse common excel formats (xls, xlsx, csv)
Stars: ✭ 31 (-88.52%)
Mutual labels:  excel, xlsx
fxl.js
ƛ fxl.js is a data-oriented JavaScript spreadsheet library. It provides a way to build spreadsheets using modular, lego-like blocks.
Stars: ✭ 27 (-90%)
Mutual labels:  excel, xlsx
MiniExcel
Fast, Low-Memory, Easy Excel .NET helper to import/export/template spreadsheet
Stars: ✭ 996 (+268.89%)
Mutual labels:  excel, xlsx
Unioffice
Pure go library for creating and processing Office Word (.docx), Excel (.xlsx) and Powerpoint (.pptx) documents
Stars: ✭ 3,111 (+1052.22%)
Mutual labels:  excel, xlsx
excelizor
A simple tool to export .xlsx files to lua-table, json and their corresponding csharp classes and golang structs
Stars: ✭ 35 (-87.04%)
Mutual labels:  excel, xlsx
lisp-xl
Common Lisp Microsoft XLSX (Microsoft Excel) loader for arbitrarily-sized / big-size files
Stars: ✭ 27 (-90%)
Mutual labels:  excel, xlsx
eec
A fast and lower memory excel write/read tool.一个非POI底层,支持流式处理的高效且超低内存的Excel读写工具
Stars: ✭ 93 (-65.56%)
Mutual labels:  excel, xlsx
goxlsxwriter
Golang bindings for libxlsxwriter for writing XLSX files
Stars: ✭ 18 (-93.33%)
Mutual labels:  excel, xlsx
XToolset
Typed import, and export XLSX spreadsheet to JS / TS. Template-based create, render, and export data into excel files.
Stars: ✭ 110 (-59.26%)
Mutual labels:  excel, xlsx
spark-hadoopoffice-ds
A Spark datasource for the HadoopOffice library
Stars: ✭ 36 (-86.67%)
Mutual labels:  excel, xlsx
ExcelFormulaBeautifier
Excel Formula Beautifer,make Excel formulas more easy to read,Excel公式格式化/美化,将Excel公式转为易读的排版
Stars: ✭ 27 (-90%)
Mutual labels:  excel, xlsx
umya-spreadsheet
A pure rust library for reading and writing spreadsheet files
Stars: ✭ 79 (-70.74%)
Mutual labels:  excel, xlsx
sheet2dict
Simple XLSX and CSV to dictionary converter
Stars: ✭ 206 (-23.7%)
Mutual labels:  excel, xlsx
spreadsheet
Yii2 extension for export to Excel
Stars: ✭ 79 (-70.74%)
Mutual labels:  excel, xlsx
hfexcel
JSON to Excel in Python. 🐍 Human Friendly excel creation in python. 📄 easy, advanced and smart api. json to excel conversion support.. ❤️
Stars: ✭ 16 (-94.07%)
Mutual labels:  excel, xlsx
xlsx reader
A production-ready XLSX file reader for Elixir.
Stars: ✭ 46 (-82.96%)
Mutual labels:  excel, xlsx
OpenSpreadsheet
OpenSpreadsheet provides an easy-to-use wrapper around the OpenXML spreadsheet SAX API. It specializes in efficiently reading and writing between strongly typed collections and worksheets.
Stars: ✭ 24 (-91.11%)
Mutual labels:  excel, xlsx
svelte-sheets
Blazing fast excel sheets in the browser, hugely inspired by JExcel, built with Svelte and XLSX.
Stars: ✭ 45 (-83.33%)
Mutual labels:  excel, xlsx
excel validator
Python script to validate data in Excel files
Stars: ✭ 14 (-94.81%)
Mutual labels:  excel, xlsx

version downloads

Creek - Stream parser for large Excel (xlsx and xlsm) files.

Creek is a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.

Installation

Creek can be used from the command line or as part of a Ruby web framework. To install the gem using terminal, run the following command:

gem install creek

To use it in Rails, add this line to your Gemfile:

gem 'creek'

Basic Usage

Creek can simply parse an Excel file by looping through the rows enumerator:

require 'creek'
creek = Creek::Book.new 'spec/fixtures/sample.xlsx'
sheet = creek.sheets[0]

sheet.rows.each do |row|
  puts row # => {"A1"=>"Content 1", "B1"=>nil, "C1"=>nil, "D1"=>"Content 3"}
end

sheet.simple_rows.each do |row|
  puts row # => {"A"=>"Content 1", "B"=>nil, "C"=>nil, "D"=>"Content 3"}
end

sheet.rows_with_meta_data.each do |row|
  puts row # => {"collapsed"=>"false", "customFormat"=>"false", "customHeight"=>"true", "hidden"=>"false", "ht"=>"12.1", "outlineLevel"=>"0", "r"=>"1", "cells"=>{"A1"=>"Content 1", "B1"=>nil, "C1"=>nil, "D1"=>"Content 3"}}
end

sheet.simple_rows_with_meta_data.each do |row|
  puts row # => {"collapsed"=>"false", "customFormat"=>"false", "customHeight"=>"true", "hidden"=>"false", "ht"=>"12.1", "outlineLevel"=>"0", "r"=>"1", "cells"=>{"A"=>"Content 1", "B"=>nil, "C"=>nil, "D"=>"Content 3"}}
end

sheet.state   # => 'visible'
sheet.name    # => 'Sheet1'
sheet.rid     # => 'rId2'

Filename considerations

By default, Creek will ensure that the file extension is either *.xlsx or *.xlsm, but this check can be circumvented as needed:

path = 'sample-as-zip.zip'
Creek::Book.new path, :check_file_extension => false

By default, the Rails file_field_tag uploads to a temporary location and stores the original filename with the StringIO object. (See this section of the Rails Guides for more information.)

Creek can parse this directly without the need for file upload gems such as Carrierwave or Paperclip by passing the original filename as an option:

# Import endpoint in Rails controller
def import
  file = params[:file]
  Creek::Book.new file.path, check_file_extension: false
end

Parsing images

Creek does not parse images by default. If you want to parse the images, use with_images method before iterating over rows to preload images information. If you don't call this method, Creek will not return images anywhere.

Cells with images will be an array of Pathname objects. If an image is spread across multiple cells, same Pathname object will be returned for each cell.

sheet.with_images.rows.each do |row|
  puts row # => {"A1"=>[#<Pathname:/var/folders/ck/l64nmm3d4k75pvxr03ndk1tm0000gn/T/creek__drawing20161101-53599-274q0vimage1.jpeg>], "B2"=>"Fluffy"}
end

Images for a specific cell can be obtained with images_at method:

puts sheet.images_at('A1') # => [#<Pathname:/var/folders/ck/l64nmm3d4k75pvxr03ndk1tm0000gn/T/creek__drawing20161101-53599-274q0vimage1.jpeg>]

# no images in a cell
puts sheet.images_at('C1') # => nil

Creek will most likely return nil for a cell with images if there is no other text cell in that row - you can use images_at method for retrieving images in that cell.

Remote files

remote_url = 'http://dev-builds.libreoffice.org/tmp/test.xlsx'
Creek::Book.new remote_url, remote: true

Mapping cells with header names

By default, Creek will map cell names with letter and number(A1, B3 and etc). To be able to get cell values by header column name use with_headers (can be used only with #simple_rows method!!!) during creation (Note: header column is first string of sheet)

creek = Creek::Book.new file.path, with_headers: true

Contributing

Contributions are welcomed. You can fork a repository, add your code changes to the forked branch, ensure all existing unit tests pass, create new unit tests which cover your new changes and finally create a pull request.

After forking and then cloning the repository locally, install the Bundler and then use it to install the development gem dependencies:

gem install bundler
bundle install

Once this is complete, you should be able to run the test suite:

rake

There are some remote tests that are excluded by default. To run those, run

bundle exec rspec --tag remote

Bug Reporting

Please use the Issues page to report bugs or suggest new enhancements.

License

Creek has been published under MIT License

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].