citizenlab / Chat Censorship

Data related to investigation of chat client censorship

Overview

This repository contains keyword blacklists and lists of other content, such as URLs and images, used to trigger censorship in apps used in China. With the exception of WeChat, these lists were obtained by reverse engineering and are exhaustive lists of the keywords used to trigger censorship on these platforms.

The full details on data collection and analysis methods and results are available below.

Chat apps

The research below tracks daily changes to censorship in three different chat apps used in China: TOM-Skype, Sina UC, and LINE. Overall, our chat app data consists of over 4,000 blacklisted keywords.

Data: TOM-Skype and Sina UC, LINE

Live-streaming apps

The research below tracks hourly changes to censorship in three different live-streaming apps in China: YY, Sina Show, and 9158. It also documents the keywords censored by GuaGua, which does not include a mechanism for downloading updates to its censorship blacklists. Overall, our live-streaming data consists of over 20,000 blacklisted keywords.

Data: Original live-streaming data (2015), Updated live-streaming data (2017)
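As an illustration of what tracking changes between two collection runs involves, here is a minimal Python sketch that compares two snapshots of a blacklist. The function name `diff_blacklists` and the sample keywords are hypothetical placeholders; this is not the tooling used in the research, only one simple way to compare two keyword lists.

```python
def diff_blacklists(old, new):
    """Return (added, removed) keywords between two blacklist snapshots."""
    old_set, new_set = set(old), set(new)
    added = sorted(new_set - old_set)      # present only in the newer snapshot
    removed = sorted(old_set - new_set)    # present only in the older snapshot
    return added, removed


# Hypothetical snapshots; real data would be read from the dataset files.
earlier = ["keyword-a", "keyword-b", "keyword-c"]
later = ["keyword-b", "keyword-c", "keyword-d"]

added, removed = diff_blacklists(earlier, later)
print("added:", added)
print("removed:", removed)
```

Using sets ignores ordering and duplicate entries within a snapshot, so only genuine additions and removals are reported.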

Mobile games

Our research on mobile games analyzes domestic Chinese games as well as international games that have been altered to comply with Chinese regulations. Overall, we found hundreds of mobile games performing censorship, collectively censoring over 100,000 unique blacklisted keywords.

Data: Mobile games

Open source projects

This research analyzes Chinese censorship in open source projects. We extracted over 1,000 Chinese keyword blacklists from open source projects on GitHub, collectively spanning over 200,000 unique blacklisted keywords.

Data: Open source blacklists
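A count of unique keywords across many blacklists is the size of the union of those lists. The Python sketch below shows one way to compute it, assuming each blacklist has already been loaded as a list of strings; the helper name `unique_keywords` and the sample values are hypothetical, and the original analysis may have normalized keywords differently.

```python
def unique_keywords(blacklists):
    """Return the set of distinct, non-empty keywords across all blacklists."""
    unique = set()
    for keywords in blacklists:
        for keyword in keywords:
            cleaned = keyword.strip()  # drop surrounding whitespace only
            if cleaned:
                unique.add(cleaned)
    return unique


# Hypothetical blacklists with one overlapping entry.
lists = [
    ["keyword-a", "keyword-b"],
    ["keyword-b ", "keyword-c", ""],
]
print(len(unique_keywords(lists)))
```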

WeChat

Our research on WeChat censorship uses sample testing to determine what type of content, such as words, URLs, and images, can be communicated over the platform and which content is censored. We have studied what categorical content WeChat generally filters in addition to what content WeChat filters in response to specific events.

Data: Keywords and URLs (November 2016), 709 Crackdown keywords and images (April 2017), Liu Xiaobo keywords and images (July 2017), 19th Party Congress keywords (November 2017), Image filtering test data (May 2018)

Keyword Content Analysis

Datasets include the raw keyword lists collected from the applications. Many also include processed data, such as translations and categorizations of the keywords. Keywords were translated to English using a combination of machine and human translation. Interpreting these translations together with contextual information, we coded each keyword into content categories, which are grouped under six general themes according to a code book.
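For readers working with the processed data, the following Python sketch tallies coded keywords by theme. It assumes each record is a dictionary with `keyword`, `category`, and `theme` fields; those field names and the sample values are hypothetical and may not match the column names in the actual dataset files.

```python
from collections import Counter


def count_by_theme(rows):
    """Count how many coded keywords fall under each theme."""
    return Counter(row["theme"] for row in rows)


# Hypothetical coded records; theme and category labels are placeholders.
rows = [
    {"keyword": "keyword-a", "category": "category-1", "theme": "theme-1"},
    {"keyword": "keyword-b", "category": "category-2", "theme": "theme-1"},
    {"keyword": "keyword-c", "category": "category-3", "theme": "theme-2"},
]

for theme, count in count_by_theme(rows).most_common():
    print(theme, count)
```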

License

All data is provided under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license, which is available in full here and summarized here.
