zauberlehrling

A collection of tools and ideas for splitting up a big monolithic PHP application into smaller parts, i.e. smaller applications and microservices. It contains console commands for identifying potentially unused PHP files, Composer packages, MySQL tables and public web assets.

The name "zauberlehrling" derives from the famous poem by Johann Wolfgang von Goethe (you may also have seen the iconic cartoon "Fantasia" by Walt Disney). In these tales, a sorcerer's apprentice splits up a magical, out-of-control broom with an axe. Unfortunately for him, each piece has a life of its own and only multiplies the problem.

Installation

git clone https://github.com/webfactory/zauberlehrling.git
cd zauberlehrling
composer install

When asked for the database parameters, provide the connection details for your local copy of the monolith's database. If your monolith has no database, or you don't want any help with it, keep the default parameters.

Splitting up the monolith

Your local development environment

At this point, you probably know your monolith way too well. You've fixed devious bugs and, if you're brave/ruthless enough, you might even have added a feature. So I guess you've set up your local development environment already.

Just a tip: During the split, you might wish to do several dumps of your production database. Consider slimdump for storing dump configurations. These configurations are really handy, as they can be shared among your coworkers and provide neat features. E.g. you can ignore more and more tables as they turn out to be irrelevant for your extracted application; you can also ignore BLOB columns or dump only rows matching certain conditions to speed up the dump process. And you can easily anonymize personal data to protect your customers.

Greenfield or Brownfield?

The answer to this question seems to depend mostly on the amount of code you want to reuse. If you know you want to replace e.g. an old integrated messaging system with a shiny new microservice (i.e. a partial rewrite of the monolith), you'll probably be fine with a greenfield project with your best and latest technology.

But if you just want to split up the monolith and you're afraid of hidden dependencies, or if you want to keep your effort down and rewrite only what's necessary: my guess is you'll be better off with a brownfield project. Clone the monolith's repository to keep its history of commit messages. I find there is often much knowledge in these messages and the linked ticket systems. Sometimes they're the only chance to understand the reasoning behind a particularly crazy piece of code.

Then, get rid of everything you don't need in your extracted application. The following chapters may help.

Also, my advice is to keep a separate local working copy of the monolith. Sooner or later you'll probably encounter an error you cannot pin down to one of your refactorings, or you'll notice you've deleted too much and cannot restore it easily from your VCS. In these cases, you'll be happy to have a quick look into the working monolith.

Determine used PHP files

To determine the used PHP files, I suggest writing black box tests for each use case of your application and collecting the code coverage information during their execution.

For the black box tests, you could, for example, write Behat tests for

  • requesting the homepage
  • logging in as a user
  • submitting a search form and retrieving results
  • creating, editing and deleting an entity
  • requesting a page without proper permissions
  • ...

Depending on your project you may want to assert different things. In my experience, the following assertions were often helpful:

  • correct URL (i.e. the user is not being redirected e.g. due to authorization problems)
  • HTTP status code being 200 (i.e. the user got no fancy error page)
  • text content like "x was saved in the database" (to detect failures after form submission) - may seem brittle for a test, but I don't expect that message to be changed while you're extracting your microservice.

Now for the code coverage part. Most frameworks provide a front controller, e.g. for Symfony it's web/symfony-webapp.php. If you have Xdebug installed, you can write at the beginning of such a front controller:

xdebug_start_code_coverage();

and at its end something like this:

// $outputFile: path of the log file; choose any writable location.
$filePointer = fopen($outputFile, 'ab');
// The trailing PHP_EOL keeps the entries of consecutive requests on separate lines.
fwrite($filePointer, implode(PHP_EOL, array_keys(xdebug_get_code_coverage())) . PHP_EOL);
fclose($filePointer);

Now, when you execute your Behat tests, all executed PHP files will be written to $outputFile. I don't recommend executing your unit tests at this point, as these tests could cover code that is never used in production.
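
The two snippets above can also be combined into one block at the top of the front controller, so that the file list is still written when the application calls exit() before reaching the end of the script. This is only a sketch under assumptions: the helper name appendUsedFiles and the log file location are made up for this example, and the Xdebug calls are guarded so the script also runs where Xdebug is not loaded.

```php
<?php

/**
 * Appends the given file names to the log file, one per line.
 * (Hypothetical helper; the name is not part of zauberlehrling.)
 */
function appendUsedFiles(array $fileNames, string $outputFile): void
{
    if ($fileNames === []) {
        return;
    }
    // FILE_APPEND keeps the entries of earlier requests; LOCK_EX avoids
    // interleaved writes when several requests finish at the same time.
    file_put_contents(
        $outputFile,
        implode(PHP_EOL, $fileNames) . PHP_EOL,
        FILE_APPEND | LOCK_EX
    );
}

// At the very beginning of the front controller:
if (function_exists('xdebug_start_code_coverage')) {
    xdebug_start_code_coverage();

    // Shutdown functions also run after exit(), so early exits are logged too.
    register_shutdown_function(static function (): void {
        // Assumed location of the log file; adjust it to your setup.
        $outputFile = sys_get_temp_dir() . '/used-files.txt';
        appendUsedFiles(array_keys(xdebug_get_code_coverage()), $outputFile);
    });
}
```

Because every request appends its complete file list, the log will contain many duplicates; that is expected and is cleaned up later by the consolidation step.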

You're in no way restricted to Xdebug for collecting your coverage. For example, you could do some Aspect Oriented Programming (AOP) magic; just remember that it may have more advanced requirements than your monolith's runtime environment can fulfill. Another idea is to use sysdig or some other form of file system monitoring.

File system monitoring tools can be tricky to use:

  • You have to make sure the file system accesses you wish to log actually happen. If a file system access is cached away by some shady component in your environment, you won't get all used files (false negatives).
  • Some tools like to index all files - e.g. for a desktop search or your IDE for static code analysis. You have to stop them from opening all files during your logging session or you will get too many results (false positives).

But if you manage to set up everything fine, file system monitoring tools have one big advantage: they're not restricted to logging executed PHP files, but can report accessed files of all sorts, e.g. configuration files. That improves the detection of (un)used packages.

For sysdig, you might want to try:

sudo sysdig -p "%fd.name" evt.type=open |grep "/your/project/" |grep -v "/your/project/tmp/" |grep -v "/your/project/log/" > used-files.txt

You can consolidate this file (removing duplicates and sorting the list of file names) with

bin/console consolidate-used-files usedFiles

where the usedFiles argument is the path to the file containing the list of used files. It will be overwritten with its consolidated version.
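
To illustrate what consolidation means here, the following sketch removes duplicates and sorts the file in place. It is not the code behind consolidate-used-files; the function name consolidateUsedFiles is made up for this example.

```php
<?php

/**
 * Rewrites the file so that it lists each file name once, in sorted order.
 * (Illustrative sketch, not the actual command implementation.)
 */
function consolidateUsedFiles(string $pathToUsedFiles): void
{
    $lines = file($pathToUsedFiles, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $pathToUsedFiles");
    }

    // Strip stray whitespace and drop lines that are empty afterwards.
    $names = array_filter(
        array_map('trim', $lines),
        static fn (string $line): bool => $line !== ''
    );

    $names = array_unique($names);
    sort($names, SORT_STRING);

    // Overwrite the input file with the consolidated list.
    file_put_contents($pathToUsedFiles, implode(PHP_EOL, $names) . PHP_EOL);
}
```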

Unused PHP files

bin/console show-unused-php-files [--pathToInspect=...] [--pathToOutput=...] [--pathToBlacklist=...] usedFiles

With this argument:

  • usedFiles: Path to a file containing the list of used files (see Determine used PHP files).

And these options:

  • -p, --pathToInspect: Path to the directory to search for PHP files. If not set, it will be determined as the common parent path of the used files.

  • -o, --pathToOutput: Path to the output file. If not set, it will be "potentially-unused-files.txt" next to the file named in the usedFiles argument.

  • -b, --pathToBlacklist: Path to a file containing a blacklist of regular expressions to exclude from the output. The blacklist may grow over time. At first, you might want to exclude temp directories and libraries. But as you inspect the list of potentially unused files, you may notice that some files are definitely needed by your application, although their usage is not detected by your tests. You can persist such insights in this blacklist.

    The file should contain one regular expression per line, e.g.:

    #/var/www/my-project/features/.*# 
    #/var/www/my-project/tmp/.*# 
    #/var/www/my-project/vendor/.*# 
    #/var/www/my-project/file-only-used-in-production-environment.php# 
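
The idea behind this command can be sketched as a set difference: all PHP files below the inspected path, minus the used files, minus everything matched by the blacklist. The function below is a simplified illustration with a made-up name, not the command's actual implementation; it assumes that the used files are given as absolute paths in the same form the directory iterator produces.

```php
<?php

/**
 * Lists PHP files below $pathToInspect that are neither contained in
 * $usedFiles nor matched by one of the blacklist regular expressions.
 */
function findPotentiallyUnusedPhpFiles(string $pathToInspect, array $usedFiles, array $blacklist): array
{
    $used = array_flip($usedFiles); // file name => index, for fast lookups
    $unused = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($pathToInspect, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        $path = $file->getPathname();
        if ($file->getExtension() !== 'php' || isset($used[$path])) {
            continue; // not a PHP file, or known to be used
        }
        foreach ($blacklist as $regExp) {
            if (preg_match($regExp, $path) === 1) {
                continue 2; // blacklisted: skip this file
            }
        }
        $unused[] = $path;
    }

    sort($unused, SORT_STRING);

    return $unused;
}
```

The blacklist array could be read from a file like the one above with file($pathToBlacklist, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES), where $pathToBlacklist stands for the path to that file.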
    

Unused Composer packages

bin/console show-unused-composer-packages [--vendorDir=...] composerJson usedFiles

With these arguments:

  • composerJson: path to the composer.json of the project to analyze
  • usedFiles: path to a file containing the list of used files (see Determine used PHP files)

And these options:

  • -l, --vendorDir: Path to the vendor directory of the project to analyze. Defaults to the directory of the composer.json + '/vendor'.
  • -b, --pathToBlacklist: Path to a file containing a blacklist of regular expressions to exclude from the output (see Unused PHP files for details).
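
A rough way to picture this check: a required package is suspicious if none of the used files lies inside its directory below the vendor directory. The sketch below makes several simplifying assumptions that may not hold for the real command: it only looks at the direct requirements in composer.json, ignores dependencies between packages, and expects packages to be installed under vendorDir/vendor-name/package-name. The function name is made up.

```php
<?php

/**
 * Returns the packages required in composer.json for which no used file
 * lies inside the package's directory below the vendor directory.
 */
function findPotentiallyUnusedPackages(string $composerJson, array $usedFiles, ?string $vendorDir = null): array
{
    // Same default as described above: directory of composer.json + '/vendor'.
    $vendorDir = rtrim($vendorDir ?? dirname($composerJson) . '/vendor', '/');
    $config = json_decode(file_get_contents($composerJson), true, 512, JSON_THROW_ON_ERROR);

    $unused = [];
    foreach (array_keys($config['require'] ?? []) as $package) {
        // Platform requirements such as "php" or "ext-json" have no vendor directory.
        if (strpos($package, '/') === false) {
            continue;
        }
        $prefix = $vendorDir . '/' . $package . '/';
        foreach ($usedFiles as $usedFile) {
            if (strncmp($usedFile, $prefix, strlen($prefix)) === 0) {
                continue 2; // at least one file of this package was used
            }
        }
        $unused[] = $package;
    }

    return $unused;
}
```

A package reported this way may still be needed indirectly, for example as a dependency of another package, so treat the result as a list of candidates to inspect.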

Unused Public Assets

bin/console show-unused-public-assets [--regExpToFindFile=...] [--pathToOutput=...] [--pathToBlacklist=...] pathToPublic pathToLogFile

With these arguments:

  • pathToPublic: Path to the public web root of your project.
  • pathToLogFile: Path to the web server's access log file.

And these options:

  • -r, --regExpToFindFile: Regular expression for the log file, capturing the path of the accessed file as its first capture group. Defaults to #"(?:get|post) ([a-z0-9\_\-\.\/]*)#i.
  • -o, --pathToOutput: Path to the output file. If not set, it will be "potentially-unused-public-assets.txt" in the folder above the public web root.
  • -b, --pathToBlacklist: Path to a file containing a blacklist of regular expressions to exclude from the output (see Unused PHP files for details).
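
To see what the default regular expression captures, here is a small sketch that applies it to a few log lines. The log lines are made up in the style of a common Apache/nginx access log, and the function name extractAccessedPaths is hypothetical; the real command may process the log differently.

```php
<?php

/**
 * Extracts the requested paths from access log lines, using the first
 * capture group of $regExpToFindFile.
 */
function extractAccessedPaths(array $logLines, string $regExpToFindFile): array
{
    $paths = [];
    foreach ($logLines as $line) {
        if (preg_match($regExpToFindFile, $line, $matches) === 1 && ($matches[1] ?? '') !== '') {
            $paths[$matches[1]] = true; // array keys de-duplicate the paths
        }
    }

    return array_keys($paths);
}

// The default expression quoted in the option description above.
$default = '#"(?:get|post) ([a-z0-9\_\-\.\/]*)#i';

// Made-up example log lines.
$logLines = [
    '127.0.0.1 - - [01/Feb/2017:10:00:00 +0100] "GET /css/main.css?v=2 HTTP/1.1" 200 512',
    '127.0.0.1 - - [01/Feb/2017:10:00:01 +0100] "POST /login HTTP/1.1" 302 0',
    '127.0.0.1 - - [01/Feb/2017:10:00:02 +0100] "GET /css/main.css HTTP/1.1" 304 0',
];

print_r(extractAccessedPaths($logLines, $default));
```

Note that the character class of the default expression contains neither "?" nor "%", so query strings are cut off and paths with URL-encoded characters are truncated. The expression also matches only GET and POST requests. If your assets are requested differently, pass your own expression via --regExpToFindFile.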

Unused MySQL Tables

So, you've cloned your code base, and you have probably copied your database as well. How do you find the unused tables?

The idea is analogous to the code coverage. First, enable logging in MySQL and possibly delete old log data, e.g. with

SET global general_log = 1;
SET global log_output = 'table';
TRUNCATE mysql.general_log;

Then execute your tests for all use cases of your application. Afterwards, you can disable MySQL logging with

SET global general_log = 0;

Finally, call the following console command:

bin/console show-unused-mysql-tables
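
Conceptually, a table is potentially unused if its name never shows up in the logged statements. The sketch below illustrates that idea with a deliberately naive whole-word search. It is an assumption about the general approach, not the command's implementation (which may parse the SQL properly), and the function name is made up. The statements are assumed to come from the argument column of mysql.general_log.

```php
<?php

/**
 * Returns the tables whose name does not occur in any logged statement.
 * Naive on purpose: names are matched as whole words, case-insensitively,
 * without parsing the SQL.
 */
function findPotentiallyUnusedTables(array $allTables, array $loggedStatements): array
{
    $unused = [];
    foreach ($allTables as $table) {
        // Lookarounds prevent "user" from matching inside "user_archive".
        $pattern = '/(?<![A-Za-z0-9_$])' . preg_quote($table, '/') . '(?![A-Za-z0-9_$])/i';
        if (preg_grep($pattern, $loggedStatements) === []) {
            $unused[] = $table;
        }
    }

    return $unused;
}
```

A text search like this can produce false negatives, e.g. when a table name also appears as a column name or inside a string literal, so the result is again only a list of candidates.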

Credits, Copyright and License

This bundle was started at webfactory GmbH, Bonn.

Copyright 2016-2017 webfactory GmbH, Bonn. Code released under the MIT license.
