
WhatManager - A Geek's Way of Automating Torrenting and The Road To 40'000 Torrents and Beyond. - Elite - Forums - What

This document discusses the author's approach to managing over 40,000 torrents on What.CD using custom software and optimized hardware. The author developed WhatManager (WM) to automate tasks like downloading, seeding and metadata management. Their system utilizes ZFS for storage, running on a server with IPMI, ECC RAM, and multiple RAIDZ arrays of HDDs totaling over 10TB of storage. WM integrates with What.CD's API and provides scalable tools for tasks like transcoding thousands of torrents.

Uploaded by

devnull

9/5/2014 WhatManager: a geek's way of automating torrenting and the road to 40`000 torrents and beyond.

#5313003 karamanolev (Power TM) 1 week, 1 day ago

I. Introduction
A. My approach to most things I do with What.CD

II. Enough of this, can you be more specific?
A. Hardware / system setup
B. ZFS
C. Directory structure on top of ZFS
D. Torrent client of choice
E. Software: WM
F. More than a torrent manager
G. Integrated into What.CD
H. A transcoding heaven

III. The future
IV. Conclusion
V. Edit: You can have WM now!
VI. Edit 2: Bibliotik integration

WhatManager: a geek's way of automating torrenting and the road to 40`000
torrents and beyond; transcoding 15`000 torrents without spending a million
years.

Introduction

What.CD is an amazing resource with very devoted members. Many of us spend
hours and hours downloading, listening, scouring the web for new content,
uploading and hundreds of other activities, taking part in what we are building in
our community. There are at least as many ways to do these things as there are
users here. Everyone is managing their directory structure in a different way,
using various tools to manage their torrent clients, etc. Various ecosystems,
communities and tools exist around each major torrent client as well. They all
share one goal: make it easier to manage the content. All of them have various
advantages and disadvantages, mostly along the lines of ease of use, ease of
setup and performance. With 44`000 actively seeding torrents taking up 10.7TB
of hard drive space my setup is not the largest or most expensive one, but it
definitely has given me a perspective on how to do things efficiently and in a
way that will scale in the future. The road to #1 in torrents uploaded
(https://What.CD/top10.php?type=users&details=numul) has also been very
interesting and taught me a lot. The things I’ve done can definitely be achieved
with manual or off-the-shelf solutions, but as a software developer my approach
is very different from what most people do. It might take me 4 hours to automate
something that takes 5 minutes to do right now, but done 1000 times it’s many
man-days. What I’ll be talking about is the hardware and software architecture I
have that supports the current state and my future plans. I’ll give an overview of
the tools I have developed and the power and scalability they give me.
Scalability will be a major theme - there exist many ways you can deal with a
few thousand torrents, but far fewer when aiming for 100`000 torrents and
beyond. It is important not to have to redo things in a different way because
they have become unmanageable when you hit some point.

My approach to most things I do with What.CD

Early on as a user of What.CD I realized that I couldn’t afford to spend the time
to do everything manually, so I needed a tool where I could put whatever I needed
to automate. WM was born. Quite uninventively, WM stands for What Manager.
Having been around for about a year and a half, it has grown from a few tools patched
together to make a few tasks easier into a quite sophisticated management
platform. This is where I put most of the love I have for What.CD. An important
theme for me is to seed everything I download and to try not to lose torrents.
Data loss is a painful affair and we’ve seen the trouble it has caused fellow
members sharing their experience on the forums. A little part of me dies every
time I think about the unique piece of culture that is lost when a drive without
backup somewhere dies and a torrent at What.CD is deleted for inactivity. This
clearly conflicts with the cost of redundant storage, so I’ve
tried to balance the two. I also try to make the collection I’m storing as
self-sufficient as possible in terms of data and metadata. Along with the music I
download, I also store all the metadata attached to it: the metadata returned by
What.CD’s API, when I added it, the torrent file, whether it is part of a larger
download theme, etc. This is especially important as future-proofing culture is a
must. After all, I don’t see What.CD as a pirating site, but as a gateway to the
past and current culture of humanity.

Enough of this, can you be more specific?

Hardware / system setup

I’ll briefly talk about hardware. It is both very important and unimportant. It is what
keeps our data safe and allows us to share it and enjoy it. Because of that we
must take care of it and treat it with respect. But hardware is only the means to
an end. You can have a $100`000 box and still only store text files with recipes
for mashed potatoes. Hardware is not something to brag about or an end in itself.


What I currently use:

Supermicro X9SCA-F - IPMI is an amazing thing for remote management. ECC memory is important.
3x8GB DDR3 1600 ECC RAM - Sufficiently large error-corrected memory.
AOC-SAS2LP-MV8 - an inexpensive no-frills disk controller. We’ll talk about ZFS later, so no RAID needed.
Brand-name 800W power supply
Seagate 160GB boot drive

What.CD data is stored on:

RAIDZ of 3x2TB Seagate Desktop drives
RAIDZ of 3x3TB Western Digital Red NAS drives
RAIDZ of 3x4TB Seagate NAS drives
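
As a rough check on how much space the three pools listed above provide, here is a minimal sketch of the arithmetic. It assumes that "RAIDZ" means single-parity RAIDZ (one drive's worth of parity per pool) and ignores ZFS metadata overhead and the TB/TiB difference, so the real figures will be somewhat lower. The pool labels and the function name are made up for this example.

```python
# Raw drive sizes in TB for each RAIDZ pool, as listed above.
POOLS = {
    "3x2TB": [2, 2, 2],
    "3x3TB": [3, 3, 3],
    "3x4TB": [4, 4, 4],
}


def raidz1_usable_tb(drives):
    """Approximate usable space of a single-parity RAIDZ pool.

    One drive's worth of space goes to parity, and every drive is
    treated as being as large as the smallest one in the pool.
    """
    return (len(drives) - 1) * min(drives)


usable = {name: raidz1_usable_tb(drives) for name, drives in POOLS.items()}
total_usable = sum(usable.values())

print(usable)        # {'3x2TB': 4, '3x3TB': 6, '3x4TB': 8}
print(total_usable)  # 18
```

Under these assumptions the 10.7TB of seeded data fits in roughly 18TB of usable space, with one drive per pool allowed to fail.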

ZFS

ZFS is the best file system in the world for this use case. Period. It gives you so
much flexibility and so many safeguards that I haven’t even thought about a future life
without it. Silent corruption is a thing with hard drives that happens at a rate a
few orders of magnitude higher than what manufacturers specify and ZFS allows
you to guard against that. If you have a faulty SATA cable and every so often a byte
comes garbled - ZFS will correct the data and let you know. If you delete all
your torrents by accident - ZFS snapshots can get them back. You get the point. I have
chosen to go the way of multiple RAIDZ pools to avoid the single point of
failure of a large pool and to increase IOPS by allowing the pools to work
independently.
I highly recommend that everyone try ZFS. It might turn out that it isn’t
your thing, but it is generally a good idea to have a solution to the problems that ZFS
eliminates or eases.


Directory structure on top of ZFS

I used to store all my torrents with their original names all dumped in a single
directory. This started to cause problems: it was difficult to map directories
to torrent ids, and some torrents share the same name. Currently, the
structure resembles
/mnt/<zfs pool>/What.CD/<torrent id>/<torrent name>
Apart from the torrent contents, the <torrent id> directory also holds the original
.torrent file and a text file, ReleaseInfo.txt, that contains the JSON returned by
the What.CD API for that torrent. This data is also stored in the database, but I
like to have it as files in case the DB is fried or unusable for whatever reason.
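
The layout above can be sketched in a few lines of Python. This is only an illustration of the described structure, not WM's actual code: the function names and the name chosen for the stored .torrent file are assumptions, and the API response is treated as an already parsed dictionary.

```python
import json
from pathlib import Path


def torrent_dir(pool_root, torrent_id):
    """Return <pool root>/What.CD/<torrent id> for a given pool mount point."""
    return Path(pool_root) / "What.CD" / str(torrent_id)


def store_metadata(pool_root, torrent_id, torrent_bytes, api_response):
    """Write the .torrent file and ReleaseInfo.txt next to the torrent data."""
    directory = torrent_dir(pool_root, torrent_id)
    directory.mkdir(parents=True, exist_ok=True)
    # The .torrent file name is a hypothetical choice for this sketch.
    (directory / f"{torrent_id}.torrent").write_bytes(torrent_bytes)
    # Keep the API's JSON on disk so it survives a lost or broken database.
    (directory / "ReleaseInfo.txt").write_text(
        json.dumps(api_response, indent=2), encoding="utf-8"
    )
    return directory
```

Because the directory is keyed by torrent id, two torrents with the same name never collide, and the id can be read straight off the path.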

Torrent client of choice

Every torrent client has its pros and cons, but it quickly became clear that even
what I consider the most performant client out there, rTorrent, can’t reasonably
handle many tens of thousands of torrents, so I’d need multiple instances of the
client. That also meant that I’d need an easy way to manage them with
custom software, so a decent, easy-to-use API was a must. Unfortunately,
rTorrent’s API is anything but easy to use, so that was out very quickly. The
client I’m using is Transmission. Definitely not the highest-performing, but
reasonably feature-complete. With more than 5000 torrents an instance is practically
unmanageable during high-load times and nowadays I tend to keep it at
around 2000 torrents per instance, as I can afford as many instances as I need. I
have 28 instances right now, running on consecutive ports for ease of
management.
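
To make the numbers concrete, here is a small sketch of how instances on consecutive ports might be planned. The naming scheme and the base port are assumptions made for this example (9091 is simply Transmission's default RPC port); the post does not say which ports are actually used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    rpc_port: int


def instances_needed(total_torrents, per_instance):
    """Smallest number of clients that keeps each one at or below the cap."""
    return math.ceil(total_torrents / per_instance)


def plan_instances(count, base_port=9091):
    """Lay out `count` instances on consecutive RPC ports.

    base_port and the "transmissionNN" names are hypothetical.
    """
    return [
        Instance(f"transmission{i + 1:02d}", base_port + i) for i in range(count)
    ]


print(instances_needed(44_000, 2_000))  # 22
instances = plan_instances(28)
print(instances[0], instances[-1])
```

At about 2000 torrents per instance, 44`000 torrents need at least 22 instances, so 28 leaves some headroom. With consecutive ports, the n-th instance is always at the base port plus n - 1, which keeps scripts and configuration simple.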

Software: WM

The software for managing torrents is centered around WM - a Python/Django
web interface - but also contains a number of external tools: cron jobs,
userscripts and the transcoder. To give you a general overview of what it is, I’ll
show you a couple of screenshots of the main pages of WM.


The central goal of WM is to make it easy to add torrents transparently to the
most appropriate client, display the state of torrents in one place regardless of
which client they are in and enable maintenance from a central
place. More than a few clients are a pain to manage by hand and an abstraction
over this is welcome. Some features of WM for torrent management are:

Add torrents to the client with the fewest torrents and the ZFS pool with the
most free space
Display torrents’ download progress aggregated over all clients
Scan all clients for torrents that have tracker or other errors
Keep a queue of torrents to download. Adding 100 simultaneously isn’t a
good thing for performance, so I can queue as many torrents as I like and
it will download them at a reasonable rate.
Check the integrity of files and torrents: verify a torrent’s data is in the
right place, seeded by exactly one torrent client, no duplicate torrents
exist, etc.
Keep track of my ratio/buffer/uploads much in the same way Ratio
Sherlock does. I just like to have that data with me.
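
Two of the features above, placement and the "exactly one client" integrity check, can be sketched as follows. This is a simplified model under stated assumptions, not WM's implementation: the Client and Pool classes are hypothetical stand-ins for what would really come from each Transmission instance and from the file system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Client:
    name: str
    torrent_ids: set = field(default_factory=set)


@dataclass
class Pool:
    name: str
    free_bytes: int


def choose_placement(clients, pools):
    """Pick the client with the fewest torrents and the pool with the most free space."""
    client = min(clients, key=lambda c: len(c.torrent_ids))
    pool = max(pools, key=lambda p: p.free_bytes)
    return client, pool


def integrity_problems(clients, expected_ids):
    """Report torrents that are not seeded by exactly one client."""
    counts = Counter(tid for c in clients for tid in c.torrent_ids)
    missing = sorted(tid for tid in expected_ids if counts[tid] == 0)
    duplicated = sorted(tid for tid, n in counts.items() if n > 1)
    return {"missing": missing, "duplicated": duplicated}


clients = [Client("a", {1, 2, 3}), Client("b", {3})]
pools = [Pool("pool1", 100), Pool("pool2", 500)]
client, pool = choose_placement(clients, pools)
print(client.name, pool.name)                     # b pool2
print(integrity_problems(clients, {1, 2, 3, 4}))  # {'missing': [4], 'duplicated': [3]}
```

In a real setup the free space per pool could be read with `shutil.disk_usage(path).free`, and the torrent ids per client would come from each client's API.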

More than a torrent manager

After the main part of WM was done, I had a pretty nice abstraction layer on top
of raw torrent clients: I had the metadata on what.cd torrents, I could download
torrents with a single line containing the torrent id and I didn’t care where it went, I
just knew where to find the files. It quickly became apparent that some nice
things could be done. What I built on top is:

Stream music from my server using a web browser. Just click a button and
start listening to the album wherever you are. Quite nice. The MP3
streaming part was easy, HTML5 has good support nowadays. But hey,
we love lossless, don’t we? It turns out the guys at
https://github.com/audiocogs/ have coded a FLAC decoder in JavaScript,
so using that I can listen to FLAC torrents from my browser as well. Pretty
neat.
Download a zip with the torrent contents: having to SCP files from my
servers quickly became a pain, so I can just download whatever I want
straight from the browser, even if I’m on someone else’s computer.
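
The zip download can be illustrated with Python's standard library. This is a minimal sketch assuming the archive is built in memory from a torrent's directory; the function name is hypothetical, and a real server handling large FLAC albums would more likely stream the archive than hold it in RAM.

```python
import io
import zipfile
from pathlib import Path


def zip_directory(directory):
    """Return the bytes of a zip archive with every file under `directory`."""
    directory = Path(directory)
    buffer = io.BytesIO()
    # Audio is already compressed, so store the files without recompressing.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                # Keep the top-level folder name inside the archive.
                arcname = path.relative_to(directory.parent).as_posix()
                archive.write(path, arcname=arcname)
    return buffer.getvalue()
```

A web view would return these bytes with a `Content-Type` of `application/zip` and a `Content-Disposition` header naming the file.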

Integrated into What.CD

At some point I discovered the elegance of user scripts and decided that it would
be most wonderful if I could browse torrents, add them to my server, watch them
download and listen to them, all without ever leaving the beautiful wooden
panels of What. A picture, as they say, is worth a thousand words:

The add-on interface is minimalistic, but a huge time saver. I almost never have
to go back and forth between file managers, torrent clients, file transfer
managers, players and such.

A transcoding heaven

After deciding that I wanted to contribute to What.CD by uploading torrents, I
quickly realized I don’t really have a good source for 100% FLACs and I turned
to transcoding. Iterating through a half-baked C# desktop app and a few other
attempts, I arrived at the current solution - a completely automated transcoding
process from start to end. Give it a torrent straight from the What.CD interface,
it will patiently download it, check it for errors or problems, transcode and
upload it back to What.CD. This is not for the faint-hearted though - I
received a few warnings before I got to the point where the transcoder complies
with What.CD rules.
On a side note, I apologize to any mods reading that had to deal with the mess the
transcoder was creating. You guys are awesome, keep up the good work.
What.CD has that special feeling thanks to you.
Right now, it is implemented as a Python process that interfaces with
the website and database to run the transcode jobs. It uses the flac and lame
binaries to decode/encode, potentially changing the sample rate/bit depth with
sox if needed. There’s a lot of code checking for potential issues like missing
tags, source encoding errors (a surprisingly high number of What.CD torrents
have encoding errors), bad filenames and whatnot. This wasn’t easy to come up
with and as I have already mentioned, it cost me a few warnings.
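
The decode/encode step described above might look roughly like the following. This is a sketch under assumptions, not the transcoder's real code: the preset table, the function names and the rule for choosing 44.1 vs 48 kHz are illustrative, and none of the tag, filename or error checks mentioned above are shown. The command-line flags themselves are standard flac, lame and sox options.

```python
import subprocess

# lame options for the usual MP3 targets; the preset names are just labels.
LAME_PRESETS = {
    "V0": ["-V", "0"],
    "V2": ["-V", "2"],
    "320": ["-b", "320"],
}


def target_sample_rate(source_rate):
    """Assumed rule: multiples of 48 kHz go to 48 kHz, everything else to 44.1 kHz."""
    return 48000 if source_rate % 48000 == 0 else 44100


def needs_resample(source_rate, bit_depth):
    """Anything above 16-bit / 48 kHz is brought down before MP3 encoding."""
    return source_rate > 48000 or bit_depth > 16


def build_commands(flac_path, mp3_path, preset, source_rate, bit_depth):
    """Return (decoder_cmd, encoder_cmd); decoder stdout feeds encoder stdin."""
    if needs_resample(source_rate, bit_depth):
        # sox reads the FLAC itself and writes 16-bit WAV to stdout.
        decoder = ["sox", str(flac_path), "-b", "16", "-t", "wav", "-",
                   "rate", "-v", str(target_sample_rate(source_rate)), "dither"]
    else:
        decoder = ["flac", "--decode", "--stdout", "--silent", str(flac_path)]
    encoder = ["lame", "--quiet", *LAME_PRESETS[preset], "-", str(mp3_path)]
    return decoder, encoder


def run_pipeline(decoder, encoder):
    """Pipe the decoder into the encoder; True if both exit cleanly."""
    dec = subprocess.Popen(decoder, stdout=subprocess.PIPE)
    enc = subprocess.run(encoder, stdin=dec.stdout)
    dec.stdout.close()
    return dec.wait() == 0 and enc.returncode == 0
```

For example, `build_commands("01.flac", "01.mp3", "V0", 44100, 16)` yields a plain `flac | lame -V 0` pipeline, while a 24-bit/96 kHz source is routed through sox first.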
I have also experimented with giving access to the transcoder to a few fellow
What.CD members - if they ever encounter a FLAC that is missing an MP3
version that they would like, with a click of a button and a few minutes of
waiting they can enjoy the MP3 version. My server goes in and does the work
for them (the download and upload happen with my account though, they
just enjoy the option to download MP3s).

The future

I want to do many things with WM. On the horizon is a hardware upgrade to
allow for more hard drives, hence more space. I’m looking at these options for
server cases:

NORCO RPC-4224 - the cheapest of the bunch, but seems somewhat low
quality
45drives Storinator - the commercial implementation of the Backblaze
Storage Pod design. Quite nice and probably the one I’ll end up getting. 45
HDDs in an $800 case is something hard to achieve.
A Supermicro storage chassis - twice the price of the Storage Pod, but you
get pretty decent build quality engineered for datacenters.

I’ll wait until the prices of 6TB drives are reasonable and I’ll add a new RAIDZ
pool with 3 of them.
On the software front, a long-term vision I have is a Spotify-like interface based
on What.CD metadata and torrents. The one and only music player interface. But it
turns out that’s not very easy to do, so I don’t know when I’ll get around to
implementing it. Of course, WM also requires little tweaks and additions, which
also eat into my free time.
I’m also looking at creating a similar looking interface for e-books from a
popular private e-book tracker. Ideally, the two things will be integrated in one
interface.

Conclusion

I feel like I’ve given a pretty good overview of how things are done and how
WM is making downloading, managing and listening to music super easy. If
you’re interested to hear more on any of the topics covered or if I missed
something you want me to write about, let me know.
What do you think?

P.S. My thanks go to Fawk, my interviewer and Kenny_AF, who helped edit and
improve this post and beta-tested the transcoder extensively.

Edit: You can have WM now!



Due to popular demand, I have open sourced WM and now it's available on
GitHub along with some awesome installation instructions. If you feel your
Linux-fu is strong, head over to the wiki at
https://github.com/karamanolev/WhatManager2/wiki and get started!

Edit 2: Bibliotik integration

As the next step of WM's evolution, I've added integration with Bibliotik that
supports all existing features: monitoring torrents, userscript integration straight
into Bibliotik, etc.

Last edited by karamanolev 3 days, 19 hours ago

https://what.cd/forums.php?action=viewthread&threadid=193134