
michaeljs1990 / jmem

License: GPL-2.0
Break up huge JSON arrays into manageable sizes.



Jmem

Iterate through large JSON arrays without eating up all your memory.

Example Use

To start using jmem, add the following requirement to your composer.json file.

{
    "require": {
        "mschuett/jmem": "dev-master"
    }
}

Next, set up the autoloader and away you go.

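<?php require 'vendor/autoload.php';

// First argument: path to the JSON file. Second argument: name of the array to break up.
$gen = new Jmem\JsonLoader("Hangouts.json", "conversation_state");

foreach($gen->parse()->start() as $obj) {

    // $obj->stream holds one element of the array; $obj->number is its position.
    $obj->stream;

}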

About

Jmem was written to parse huge JSON files, originally to load a 200MB+ Google Hangouts JSON export into a database. PHP is likely not the best tool for this job, but when it's the only thing available it does just fine. It currently takes about 1.5 minutes to break up a 200MB file, so running this in the background after an upload is your best bet. A generator is used so that only a small part of the file is held in memory at any time.
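To illustrate why a generator keeps memory use low, here is a minimal sketch of the general technique. It is not jmem's actual implementation: the function name streamArrayObjects is made up for this example, and it assumes the target array contains only JSON objects. It reads the file a few bytes at a time, tracks brace depth and string state, and yields each object as soon as its closing brace is seen.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of jmem): yields the JSON text of each object
 * in the array stored under the key $element, reading $bytes bytes at a time.
 * Keys of the yielded values are positions in the array, starting at 1.
 */
function streamArrayObjects(string $file, string $element, int $bytes = 1024): Generator
{
    $handle = @fopen($file, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $file");
    }

    $needle    = '"' . $element . '"';
    $needleLen = strlen($needle);
    $tail      = '';     // most recent characters, used to spot the key
    $keySeen   = false;  // true right after the key text has been matched
    $inArray   = false;  // true once the '[' following the key has been read
    $depth     = 0;      // brace depth inside the current object
    $inString  = false;  // true while inside a JSON string
    $escaped   = false;  // true right after a backslash inside a string
    $current   = '';     // text of the object being collected
    $number    = 0;

    try {
        while (($chunk = fread($handle, $bytes)) !== false && $chunk !== '') {
            $len = strlen($chunk);
            for ($i = 0; $i < $len; $i++) {
                $ch = $chunk[$i];

                if (!$inArray) {
                    if ($keySeen) {
                        if ($ch === '[') { $inArray = true; continue; }
                        if ($ch === ':' || strpos(" \t\r\n", $ch) !== false) { continue; }
                        $keySeen = false; // the match was a value, not our key
                    }
                    $tail = substr($tail . $ch, -$needleLen);
                    if ($tail === $needle) { $keySeen = true; $tail = ''; }
                    continue;
                }

                if ($inString) {
                    $current .= $ch;
                    if ($escaped) { $escaped = false; }
                    elseif ($ch === '\\') { $escaped = true; }
                    elseif ($ch === '"') { $inString = false; }
                    continue;
                }

                if ($depth === 0) {
                    if ($ch === ']') { return; }    // end of the target array
                    if ($ch !== '{') { continue; }  // commas and whitespace between objects
                }

                $current .= $ch;
                if ($ch === '"') { $inString = true; }
                elseif ($ch === '{') { $depth++; }
                elseif ($ch === '}' && --$depth === 0) {
                    yield ++$number => $current;    // hand one object to the caller
                    $current = '';
                }
            }
        }
    } finally {
        fclose($handle);
    }
}
```

Because each object is yielded and then discarded, peak memory depends on the chunk size and the largest single object rather than on the size of the whole file.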

Documentation

I have commented the code extremely well, so it is easy to look through the source and make changes. With that said, here is the documentation for the main JsonLoader class.

/**
 * $file is the path to the file that you would like to parse through.
 * If the file does not exist an exception will be thrown. $element is
 * the name of the array of objects you would like to have broken up
 * and returned to you.
 * 
 * @param String $file
 * @param String $element
 * @param int $bytes (1024 default)
 */

The generator yields a JsonObject that gives you access to the following properties. stream contains the full JSON object, and number is the position of the object in the array, starting at 1.

class JsonObject {

    public $number = 0;

    public $stream = "";

    public function __construct($stream, $num) {
        $this->stream = $stream;
        $this->number = $num;
    }

}
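As a sketch of how one yielded object might be consumed, the example below decodes the stream property with PHP's built-in json_decode. The helper name decodeStream is hypothetical and not part of jmem, the example assumes stream holds JSON text, and the keys in the sample data are invented for illustration. A plain object with the same two public properties stands in for JsonObject so the snippet runs on its own.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of jmem): decodes the JSON text held in
 * $obj->stream into an associative array, or throws if it cannot.
 */
function decodeStream(object $obj): array
{
    $data = json_decode($obj->stream, true);
    if (!is_array($data)) {
        throw new UnexpectedValueException(
            "Element {$obj->number} could not be decoded: " . json_last_error_msg()
        );
    }
    return $data;
}

// Stand-in with the same public properties as JsonObject (sample data is made up).
$obj = (object) ['stream' => '{"id": "abc", "events": []}', 'number' => 1];

$row = decodeStream($obj);
echo $obj->number, ': ', $row['id'], PHP_EOL;
```

Inside the foreach loop from the usage example, the decoded array could then be passed to whatever code writes rows to your database.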

Contribute

Please feel free to report any bugs you find or submit a pull request. I love feedback, good and bad!
