Community Data Dump

A list of Stack Exchange Community Creative Commons Data Dump releases.

Releases

Torrents and Internet Archive items that were created by the company are indicated with the 🏢 emoji, while those created by community members are indicated with the 👥 emoji.

Regular Releases

All known releases of the Stack Exchange Network's Creative Commons Community Data Dump, containing all non-deleted posts from non-beta communities, with links to download using BitTorrent or from the Internet Archive. Some releases are no longer available.

The primary version of each release is a collection of XML files in compressed 7ZIP containers, representing the database schema described in this post. These are available in an unofficial Torrent RSS Feed. Some releases are also available as unofficial conversions to other formats.
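Once an archive has been extracted, the XML files can be read without loading them fully into memory. The following is a minimal sketch, not an official tool. It assumes each file holds a single root element containing one `row` element per record, with the record's fields stored as attributes. The attribute names used here (`Id`, `PostTypeId`, `Score`) and the helper name `iter_rows` are illustrative assumptions; check them against the schema description before relying on them.

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, Iterator


def iter_rows(source) -> Iterator[Dict[str, str]]:
    """Yield the attributes of each <row> element, one record at a time.

    `source` may be a file path or a binary file object.
    The element name "row" is an assumption about the dump layout.
    """
    # iterparse reads the document incrementally instead of building it all at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield dict(elem.attrib)  # copy the attributes before clearing
            elem.clear()  # drop the element's attributes and children after use


# Tiny in-memory stand-in for an extracted file; real files are far larger.
SAMPLE = (
    b'<posts>'
    b'<row Id="1" PostTypeId="1" Score="5" />'
    b'<row Id="2" PostTypeId="2" Score="3" />'
    b'</posts>'
)

rows = list(iter_rows(io.BytesIO(SAMPLE)))
print(len(rows), rows[0]["Id"])
```

For a real file, pass its path instead of the `io.BytesIO` object and consume the generator in a loop rather than collecting it into a list, since a single site's files can be many gigabytes.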

Release Size Type Torrent File Magnet Link Internet Archive
December 2024 366 files 88 GiB XML 9132fa09 👥 9132fa09 👥 Archived 👥
September 2024 366 files 90 GiB XML 745b43ba 👥 745b43ba 👥 Archived 👥
June 2024 366 files 78 GiB XML 530a2ef6 👥 530a2ef6 👥 Archived 👥
…initial buggy release 364 files 90 GiB XML 5afacf7e 👥 5afacf7e 👥 Archived 👥
…as JSON 370 files 72 GiB JSON d3caab0e 👥 d3caab0e 👥 Archived 👥
…as SQLite database 365 files 82 GiB SQLite e97c4bf9 👥 e97c4bf9 👥 Archived 👥
April 2024 383 files 92 GiB XML 2ef5246c 🏢 2ef5246c 🏢 Archived 👥
…as JSON 372 files 71 GiB JSON 8e412899 👥 8e412899 👥 Archived 👥
…as SQLite database 366 files 81 GiB SQLite a9396346 👥 a9396346 👥 Archived 👥
…as SQLite (no FKs) 365 files 76 GiB SQLite b86780fe 👥 b86780fe 👥 Archived 👥
…as Postgres database 5096 files 66 GiB pg_dump 3e618152 👥 3e618152 👥 Archived 👥
…initial buggy release 383 files 103 GiB XML ab12faae 🏢 ab12faae 🏢 Archived 👥
March 2024 383 files 85 GiB XML 5a3d9b0c 🏢 5a3d9b0c 🏢 Archived 👥
December 2023 383 files 84 GiB XML 414f55d7 🏢 414f55d7 🏢 Archived 👥
September 2023 383 files 83 GiB XML d884b3a9 🏢 d884b3a9 🏢 Archived 👥
June 2023 379 files 82 GiB XML 93d9b232 🏢 93d9b232 🏢 Archived 👥
March 2023 379 files 81 GiB XML 263d19f9 🏢 263d19f9 🏢 Archived 👥
December 2022 379 files 72 GiB XML 1c65c4a7 🏢 1c65c4a7 🏢 Archived 👥
October 2022 379 files 80 GiB XML b369adf8 🏢 b369adf8 🏢 Archived 👥
June 2022 377 files 79 GiB XML 8225cdd0 🏢 8225cdd0 🏢 Archived 👥
March 2022 368 files 77 GiB XML 5854d890 🏢 5854d890 🏢 Archived 👥
December 2021 363 files 75 GiB XML 677a281a 🏢 677a281a 🏢 Archived 👥
September 2021 363 files 75 GiB XML 27689c6b 🏢 27689c6b 🏢 Archived 👥
June 2021 364 files 73 GiB XML ee64468a 🏢 ee64468a 🏢 Archived 👥
March 2021 362 files 71 GiB XML 03ae768e 🏢 03ae768e 🏢 Archived 👥
December 2020 361 files 70 GiB XML 3fa542ce 🏢 3fa542ce 🏢 Archived 👥
September 2020 362 files 68 GiB XML a321f0df 🏢 a321f0df 🏢 Archived 👥
June 2020 362 files 66 GiB XML daceb39f 🏢 daceb39f 🏢 Archived 👥
March 2020 358 files 64 GiB XML a6266395 🏢 a6266395 🏢 Archived 👥
December 2019 358 files 62 GiB XML fd11cc26 🏢 fd11cc26 🏢 Archived 👥
September 2019 358 files 61 GiB XML 8cec36af 🏢 Archived 👥
June 2019 358 files 59 GiB XML 576097b0 🏢 576097b0 🏢 Archived 👥
March 2019 358 files 57 GiB XML f8b00f75 🏢 f8b00f75 🏢 Archived 👥
December 2018 362 files 55 GiB XML d1623b9e 🏢 d1623b9e 🏢 Archived 👥
September 2018 XML bddb28e2 🏢
June 2018 XML 8e6a46cc 🏢
March 2018 358 files 50 GiB XML 4ad8edb2 🏢 4ad8edb2 🏢 Archived 👥
December 2017 352 files 48 GiB XML e73b7025 🏢 e73b7025 🏢 Archived 👥
June 2017 340 files 45 GiB XML fa38c0e9 🏢 fa38c0e9 🏢
March 2017 338 files 42 GiB XML 586eebe6 🏢 586eebe6 🏢
December 2016 334 files 40 GiB XML fd86cb81 🏢 fd86cb81 🏢
September 2016 330 files 37 GiB XML be15f508 🏢 be15f508 🏢
June 2016 324 files 36 GiB XML e1a5e02e 🏢 e1a5e02e 🏢
March 2016 314 files 33 GiB XML 57ceb5ac 🏢 57ceb5ac 🏢
March 2015 281 files 25 GiB XML 0ea39049 🏢 0ea39049 🏢 Archived 👥
September 2014 264 files 22 GiB XML b1a458cb 🏢 b1a458cb 🏢
May 2014 250 files 20 GiB XML 3aed037c 🏢 3aed037c 🏢
January 2014 231 files 18 GiB XML cc28f62b 🏢 cc28f62b 🏢
September 2013 96 files 15 GiB XML c472a68f 🏢 c472a68f 🏢
June 2013 92 files 13 GiB XML 5ede0d56 🏢 5ede0d56 🏢 Archived 👥
March 2013 91 files 12 GiB XML 47e02c81 🏢 47e02c81 🏢 Archived 👥
August 2012 - 1/2 50 files 7707 MiB XML ecf5ec6d 🏢 ecf5ec6d 🏢
August 2012 - 2/2 37 files 44 MiB XML b06fff72 🏢 b06fff72 🏢 Archived 👥
May 2012 76 files 6369 MiB XML c0ecc66e 🏢 c0ecc66e 🏢 Archived 👥
December 2011 - 1/3 45 files 4997 MiB XML 77ccfd34 🏢 77ccfd34 🏢 Archived 👥
December 2011 - 2/3 3 files 156 MiB XML ccb07610 🏢 ccb07610 🏢
December 2011 - 3/3 25 files 284 MiB XML de13c16e 🏢 de13c16e 🏢 Archived 👥
September 2011 66 files 4413 MiB XML 64d052f7 🏢 64d052f7 🏢 Archived 👥
June 2011 55 files 3666 MiB XML cdb37ea2 🏢 cdb37ea2 🏢 Archived 👥
April 2011 57 files 3987 MiB XML ec636f7e 🏢 ec636f7e 🏢 Archived 👥
January 2011 42 files 3209 MiB XML 16e7c286 🏢 16e7c286 🏢 Archived 👥
November 2010 18 files 2076 MiB XML f26b0ae2 🏢 f26b0ae2 🏢 Archived 👥
October 2010 4 files 1020 MiB XML b549412a 🏢 b549412a 🏢 Archived 👥
September 2010 4 files 1027 MiB XML 8f5c7a0b 🏢 8f5c7a0b 🏢 Archived 👥
August 2010 4 files 887 MiB XML 53c3bf7f 🏢 53c3bf7f 🏢 Archived 👥
July 2010 4 files 824 MiB XML fe29a5a7 🏢 fe29a5a7 🏢 Archived 👥
June 2010 4 files 764 MiB XML 2c802d00 🏢 2c802d00 🏢 Archived 👥
May 2010 4 files 710 MiB XML f318086e 🏢 f318086e 🏢 Archived 👥
April 2010 4 files 658 MiB XML 53adea3f 🏢 53adea3f 🏢 Archived 👥
March 2010 4 files 599 MiB XML a790a8ab 🏢 a790a8ab 🏢 Archived 👥
February 2010 4 files 546 MiB XML cec670fa 🏢 cec670fa 🏢 Archived 👥
January 2010 4 files 500 MiB XML 5127f67f 🏢 5127f67f 🏢 Archived 👥
December 2009 4 files 454 MiB XML 11a2a48f 🏢 11a2a48f 🏢 Archived 👥
November 2009 4 files 407 MiB XML 107f674c 🏢 107f674c 🏢 Archived 👥
October 2009 4 files 337 MiB XML 61b882a5 🏢 61b882a5 🏢 Archived 👥
September 2009 4 files 302 MiB XML 395bb1a8 🏢 395bb1a8 🏢 Archived 👥
August 2009 1 file 267 MiB XML d348861b 🏢 d348861b 🏢 Archived 👥
…reupload 4 files 267 MiB XML f6ec47fb 🏢 f6ec47fb 🏢 Archived 👥
July 2009 1 file 230 MiB XML 2dca38c1 🏢 2dca38c1 🏢 Archived 👥
June 2009 1 file 206 MiB XML 68d22f0f 🏢 68d22f0f 🏢 Archived 👥
May 2009 1 file 200 MiB XML ea45080e 🏢 ea45080e 🏢 Archived 👥

Special Releases

Release Size Type Torrent File Magnet Link Internet Archive
Closed Sites 46 files 97 MiB XML 3983ede3 👥 3983ede3 👥 Archived 👥
Documentation 3 files 35 MiB JSON 7b4762eb 🏢 7b4762eb 🏢 Archived 🏢
Teams 3 files 122 KiB JSON 6e8f46d3 🏢 6e8f46d3 🏢 Archived 🏢
Uploaded Images 65 files 800 GiB images Archived 👥

Disclosure

This site is not affiliated with Stack Exchange Inc.

Background

Since its inception in September 2008, user content on Stack Overflow (and later the broader Stack Exchange Network) has been released under a Creative Commons license. Starting in June 2009, the company began releasing periodic "data dumps" containing most types of user-contributed content on the site. These were distributed through BitTorrent so that the community could keep seeding the content, and thereby keep it available, even if the company itself ceased to do so. From January 2014 to April 2024, the latest release was also available for download from the Internet Archive.

This was intended to help set Stack Overflow apart from some of the Q&A sites that had existed in the past, which kept user contributions under restrictive licenses and tight control, and tended to become more user-hostile over time. This spirit was described by Co-Founder and CEO Joel Spolsky, in a podcast on February 20, 2010:

From day one, we used the CC-wiki license. It's basically a license that says that we don't own the content that's on there, which is why we make those database dumps that are available.

We wanted to make sure that if no matter what happens, literally no matter who we sell to, or raise money from, or turn the site over to, and even if they take Stack Overflow, and make it an evil site where you have to pay to look at things and there's pop-up ads and pop-under ads, and you know, dancing chariots of fire that cross the screen and punch the monkey, and [...] it just becomes a big gigantic spam site.

Doesn't matter because you can just take the latest CC-wiki download that we provided and go start your own site saying "you know what, this is gonna be the clean version". And I think a lot of people will follow you. We very, very deliberately built Stack Overflow in a way that there wouldn't be any chance of locking it down.

Unfortunately, in July 2024 the company announced that, going forward, the data dumps would no longer be released as torrents or uploaded to the Internet Archive. Instead, users would need to log in and download them directly from the Stack Exchange website so that the company could monitor who was downloading them, with the explicit threat of blocking users who used the data for a purpose the company did not approve of. Additionally, there would no longer be a single release containing data from the entire network. Instead, users would need to visit each of the 368 sites in the network manually to download its data.

To preserve open access to Stack Exchange community data, some community members will download the dumps from every site across the network and redistribute the complete collection as torrents resembling those previously provided by the company. Some of these will be in the original XML format, while others will be converted to formats that are more convenient to use directly.

This site exists to provide an index of all releases, including community-aggregated ones going forward as well as the company-provided ones from the past.

License

Each item in these releases is copyright its respective authors and editors, who are identified in the release's data. Each item is distributed under a Creative Commons Attribution-ShareAlike license (CC BY-SA, previously also known as CC-Wiki), but the exact version of the license depends on the date on which the item was contributed. See stackoverflow.com/help/licensing for the version that applies to each date range.

Please see stackoverflow.blog/2009/06/25/attribution-required for guidance on how you are expected to attribute content to its authors.

This page uses some CSS from the Stacks design system, which is copyright Stack Exchange, Inc and released under the MIT License.