FlawedLogic

Genweb: 5 years

Today the Genweb project turns 5. I’m aware that I should have written this up much earlier, because the project was never formally introduced to the community, for a number of reasons. I really feel indebted to the Plone community and I want to give back some of the vast amount of knowledge it has given me. I think the best way to do that is to share what I’ve learned during these 5 years with Genweb.

Genweb (as its name hints) is an institutional web site generator. The project was born as an initiative of the Communication and Promotion Service at the BarcelonaTech University (UPC) and is sponsored by the university itself. Genweb was created with two goals in mind: to provide the university community with a content management tool usable by all audiences, and to unify the look and feel of the university’s web sites.

Genweb, “the infrastructure”

I will talk about Genweb “the application” in later posts. Suffice it to say that over the years it was based on Plone 2 and 3, and it is now based on Plone 4.1, coupled with some of the most popular Plone add-ons (Dexterity, PloneFormGen, Collage, etc.) and some additional tweaks to further improve the already awesome Plone usability.

Today I want to focus on Genweb “the infrastructure”, which I think is by far our best accomplishment. Over its 5 years of life the project has been quite a success: it currently hosts almost 400 Plone sites, which works out to a growth rate of about 80 sites per year. Designing the infrastructure to host 400 Plone sites is a big challenge, so we have had to evolve it constantly and adapt it to its ever-growing requirements. It started on a single physical server and has grown into the six-physical-server architecture that hosts the full system at present.

This is the current hardware architecture of the system:

It’s a system based on six physical machines. The specs of each server differ depending on its role. For example, each frontend has 32GB of RAM and dual quad-core CPUs. With that much horsepower, we’ve designed an architecture that can take full advantage of the machines. We have split the service into 12 twin environments.

These are the components of each environment:

Requests for every environment share the same entry point: the Pre-frontend machine, which hosts the HTTP cache accelerator (Varnish) processes and the routing and load balancing (HAProxy) towards the Frontend servers (A, B and C).

The request is processed (if not cached) by one of the three Zope clients. Each environment has three Zope clients, each assigned to one Frontend machine. With this setup the pipeline is balanced and fault tolerant at the frontend level.

The Zope clients connect to a ZEO server located on one of the Backend servers. Each ZEO server holds 35 Plone sites (Genweb instances) at most. Each instance has its own ZODB mountpoint in its ZEO server for easy management (backup and restore procedures), manageable file sizes and our own sanity. The result is one ZODB database per instance.
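
To make “balanced and fault tolerant” a bit more concrete, here is a minimal sketch of round-robin selection that skips a client that is down. It is not how HAProxy is implemented or configured in Genweb; the client names and the `make_balancer` helper are made up for illustration.

```python
from itertools import cycle

# Hypothetical names for the three Zope clients of one environment,
# one per Frontend machine (A, B and C).
CLIENTS = ["frontend-a", "frontend-b", "frontend-c"]


def make_balancer(clients):
    """Return a function that picks the next healthy client in round-robin order."""
    ring = cycle(clients)

    def pick(healthy):
        # Try each client at most once per request.
        for _ in range(len(clients)):
            candidate = next(ring)
            if candidate in healthy:
                return candidate
        raise RuntimeError("no Zope client available")

    return pick


pick = make_balancer(CLIENTS)
all_up = set(CLIENTS)
first_three = [pick(all_up) for _ in range(3)]

# Simulate Frontend B going down: requests keep flowing to A and C.
degraded = all_up - {"frontend-b"}
after_failure = [pick(degraded) for _ in range(4)]
```

With all three clients up, each one gets a request in turn; with one of them down, the remaining two simply share the load.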

Design decisions, drawbacks and solutions

As I said, we have had to adapt the system architecture over the years. We wanted to keep it bleeding edge by adopting the most recent additions and features as they emerged from the community. Genweb ran the first beta versions of plone.app.blob, which helped us reduce the size of the sites’ databases by storing images and files outside the ZODB.

Having a ZODB for each instance was a good decision, but it came with an important drawback. The system began to lose connections because the backend reached its file descriptor (FD) limit: each ZODB mountpoint uses 4 sockets to connect to each Zope client. We had three clients accessing each mountpoint, so the limit of 1024 FDs per process was easily reached. Changing the libc6 hard limit, recompiling Python and Zope, and raising the default system FD limit solved the problem. However, the fix only held for a short time, because we found that the ZEO server process has a lot of trouble handling large numbers of open sockets. After some testing we found that staying around 35 mountpoints per ZEO server was the safest and best-performing setup.

So we split the whole setup into 12 twin environments with at most 35 instances each.
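
The arithmetic behind the FD problem can be sketched in a few lines of Python. The figures (4 sockets, three clients, 1024 FDs) come from the paragraph above; the function and constant names are mine, and the calculation ignores the descriptors the ZEO process needs for its own files and logs.

```python
# Figures taken from the text above.
SOCKETS_PER_CLIENT_PER_MOUNTPOINT = 4
ZOPE_CLIENTS_PER_ENVIRONMENT = 3
DEFAULT_FD_LIMIT = 1024

SOCKETS_PER_MOUNTPOINT = (
    SOCKETS_PER_CLIENT_PER_MOUNTPOINT * ZOPE_CLIENTS_PER_ENVIRONMENT
)


def zeo_sockets(mountpoints):
    """Sockets a ZEO server holds open for the given number of mountpoints."""
    return mountpoints * SOCKETS_PER_MOUNTPOINT


# Largest number of mountpoints that still fits under the default limit.
max_mountpoints = DEFAULT_FD_LIMIT // SOCKETS_PER_MOUNTPOINT

print(zeo_sockets(35))   # sockets at the 35-mountpoint ceiling
print(max_mountpoints)   # mountpoints before hitting the default limit
```

Under these assumptions a ZEO server runs out of descriptors somewhere past 80 mountpoints, and the 35-mountpoint ceiling keeps it at well under half the default limit.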

Tools

We have used buildout since the very beginning. At first, a script distributed the instances equally between environments. After the split, we formalized this script by converting it into a buildout recipe.

The recipe basically extends the zope.conf and zeo.conf files of each Zope client and ZEO server with the list of instances assigned to it. This list can come from a remote endpoint via HTTP or from a file in the filesystem.
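
As an illustration of what “equally distribute” can mean, here is a minimal round-robin sketch. It is not the project’s actual script: the `distribute` function and the instance ids are hypothetical, and only the cap of 35 and the 12 environments come from this post.

```python
MAX_PER_ENVIRONMENT = 35  # ceiling mentioned earlier in the post


def distribute(instances, environments, cap=MAX_PER_ENVIRONMENT):
    """Spread instance ids over environments round-robin, respecting the cap."""
    if len(instances) > environments * cap:
        raise ValueError("not enough environments for these instances")
    buckets = [[] for _ in range(environments)]
    for index, instance in enumerate(instances):
        buckets[index % environments].append(instance)
    return buckets


# Hypothetical instance ids; the real ones are not listed in the post.
instances = ["genweb-%03d" % n for n in range(1, 401)]
buckets = distribute(instances, environments=12)
sizes = [len(bucket) for bucket in buckets]
```

With 400 instances and 12 environments, every environment ends up with 33 or 34 instances, safely below the cap.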
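
To give an idea of what such a recipe has to generate, here is a minimal sketch that turns a list of instance ids into mountpoint sections for zope.conf. The section layout follows the usual zope.conf `zodb_db` / `zeoclient` form, but the list format (one id per line), the storage naming, the ZEO address and the function names are all assumptions; this is not the code of the real recipe.

```python
ZODB_STANZA = """\
<zodb_db {name}>
    <zeoclient>
        server {zeo_address}
        storage {name}
        name {name}
    </zeoclient>
    mount-point /{name}
</zodb_db>
"""


def load_instances(text):
    """Parse one instance id per line, skipping blanks and # comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def render_mountpoints(instances, zeo_address):
    """Build the text to append to zope.conf for the given instances."""
    return "\n".join(
        ZODB_STANZA.format(name=name, zeo_address=zeo_address)
        for name in instances
    )


# The listing could equally be the body of an HTTP response or a local file.
listing = "# environment 1\nsite-a\n\nsite-b\n"
fragment = render_mountpoints(load_instances(listing), "backend1:8100")
```

The matching zeo.conf side would need one storage per instance as well; it is left out to keep the sketch short.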

These are the production buildouts of the project:

Each of these tools probably won’t be directly useful to you, because they are very specific to our use case, but you can extract pieces of them for your own benefit.

Summary and some numbers

In conclusion, we are currently serving more than 3M pages/month with a transfer of 200GB/month. There are 12 ZEO servers distributed between our two Backend servers (6 each) with 400 ZODB mountpoints and their corresponding blobs, 36 Zope clients (12 on each Frontend) and 12 Varnish and HAProxy pairs on the Pre-frontend serving requests for 400 Plone sites.

Happy 5th anniversary Genweb!!!


About frameworks and libraries

Lately I have noticed some confusion when people refer to different kinds of software. The word framework is, honestly, heavily overloaded. The confusion often goes as far as mistaking a well-known standard for a framework.

So, I’ll try to shed some light on the subject.

To begin with, let’s say we have an application that uses a framework or a library. I want to stress this because a framework is nothing by itself and accomplishes nothing. It may seem a naive statement, but it’s important when comparing a framework with an application (e.g. a content management system, CMS) that uses a framework to accomplish a task.

In essence, a library is a module, a self-contained piece of software that you call from your application code to accomplish specific, well-defined operations. It usually exposes an API and its features are well documented.

On the other hand, a framework calls your application code, which is in fact the meat the framework needs in order to accomplish something specific. A framework embodies some abstract design, with more behavior built in. In order to use it you need to insert your behavior into various places in the framework, either by subclassing or by plugging in your own classes. The framework’s code then calls your code at these points.

A framework may include libraries, support programs, a scripting language, or other software to help develop and glue together the different components of a software project. It usually packs the means to work with some software design pattern, for example model-view-controller (MVC), and provides the components for accessing the persistence layer, creating and managing the business logic, and a template language to create views.

Let’s see an example. Any (decent) web framework has classes or methods that embody the behavior of “a web view”. However, this behavior by itself does nothing because the view is empty. It knows it is a view… but it has no purpose. Usually, you need to override those classes or methods with your own code to give the view its purpose and define what it should accomplish.
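
The difference in who calls whom can be sketched with a toy example. The `View` class and the `dispatch` function below are hypothetical stand-ins for a framework, not the API of any real one; `string.capwords` plays the part of the library.

```python
import string  # a library: our code calls it


class View:
    """Hypothetical framework base class: knows it is a view, but does nothing."""

    def render(self, request):
        return ""


def dispatch(view, request):
    """Stand-in for the framework's machinery: it calls our code."""
    body = view.render(request)
    return "200 OK\n\n" + body


class HelloView(View):
    """Application code: fills in the purpose the empty view lacks."""

    def render(self, request):
        # Here we call a library function to do one well-defined job.
        return "Hello, " + string.capwords(request["name"]) + "!"


response = dispatch(HelloView(), {"name": "plone community"})
empty = dispatch(View(), {"name": "nobody"})
```

The application never calls `render` itself; it only supplies it, and the framework decides when to run it. The call to `string.capwords` goes the other way round: our code is in charge and the library just does what it is asked.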

Some library examples: jQuery, image manipulators, string utilities, etc.

Some framework examples: Node.js, Pyramid, Zope

One final word… guys, JSON is NOT a framework.


Speedcubing, Scrum and agile methodologies

On 7 May, the Kubaroo Open 2011 speedcubing competition took place in Melbourne, Australia. For those who have never heard of it, speedcubing is the act of solving a Rubik’s cube as fast as possible. At this championship, Feliks Zemdegs, a 15-year-old from Melbourne, set a new world record in the 3x3x3 category, stopping the clock at 6.24s. You can see the moment in the following video.

Speedcubers rely on memorizing the state of the cube at the moment of solving it and on applying predefined, well-studied methods and rules to it. These methods usually consist of a chain of actions that change the position of the cube’s pieces. Some methods have more than 100 moves, and many speedcubers memorize all the moves of several methods so that they have a very wide repertoire to draw on for each cube they are given.

The development phase of a project run with agile methodologies and solving a Rubik’s cube do not differ much; in fact they are very similar. Both need a first phase in which the situation is studied and the needs of the problem are identified. Then comes an iterative phase in which the problem is tackled by producing partial results that are nonetheless functional, operational and deliverable. The iteration is repeated until the expected result is obtained: the solution of the initial problem.

Agile methodologies are characterized by being adaptive throughout the development life cycle of the project. It should be stressed that they do not cover project management as a whole; rather, they try to tackle the development phase effectively, focusing on communication and the overall satisfaction of the client. They are especially suited to projects in which the client has neither very precise requirements nor a one hundred percent clear picture of the final solution. In that situation it is usually very hard to write an extensive requirements document, as would be done with predictive methodologies. Web content management projects often fall into this scenario.

Scrum is just one of the many agile methodologies developed over the last few years. Basically, it fulfils many of the characteristics laid out in the Agile Manifesto, a document that tries to condense all the virtues of agile methodologies. The following are the main goals and characteristics of agile methodologies:

  • Satisfy the client by quickly delivering usable partial solutions to their problem
  • Changes in requirements are welcome, even late in development
  • Software is delivered frequently (weeks rather than months)
  • The amount of working software approved by the client is the main measure of progress
  • Sustainable development, able to maintain a constant pace
  • Daily cooperation and teamwork among the members of the team, both sales/account managers and developers
  • Face-to-face collaboration is the best form of communication
  • Projects are built around motivated people, who carry the responsibility for the project’s success and, as such, must be trusted
  • Continuous attention to technical excellence and good software design
  • Simplicity
  • Self-organizing teams
  • The team is able to adapt to change easily

To achieve these goals, Scrum proposes a base process built on iterations, the assignment of roles within the team, and several artifacts that support the whole process. These are some of its key points:

  • Development based on iterations of the Scrum process: planning, development, test, demo
  • At the start of each iteration, the priority features are agreed with the client and planned with the team. Each team member takes on one task and does not pick up another until it is finished
  • At the end of each iteration, the client gets a demo of the software working so far. This gives them small tastes of the final picture, lets them comment on and influence the initial requirements, and feeds their opinion and vision of the final product back to the team
  • Every day all the team members meet to raise any problems or doubts that came up during the previous day, so they can be resolved as soon as possible
  • The team is charged with focusing on the development of the project and nothing else; any external interruption is handled appropriately so that the team does not lose sight of its main goal

Pages and pages could be written about Scrum, but its true power lies in the results of applying it: overall client satisfaction throughout the development of the project, combined with high quality in the software delivered.

By the way, my record for solving the Rubik’s cube is 3m20s… what’s yours?


testing highlight

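# Minimal snippet used only to test syntax highlighting
b = range(3)  # placeholder iterable; the original snippet left `b` undefined
for a in b:
    print("hello")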


Testing! blablalb lball ballb al lbalblalbl blalbalb

Testing Tumblr… Trying to find out if it’s a real alternative to a full blog platform (the way a WordPress-like blog is).


Plone 3 Intranets book reviews

Time flies: it’s already been two months since the book was published, and during this time I’ve received a lot of congratulations from family, friends and colleagues. I want to thank you all.

I’ve made a compilation of several reviews that have been published about the book.

Thank you guys for the nice reviews!


Plone 3 Intranets book

Well, it’s done. After a year of work, Packt Publishing has published my first book: Plone 3 Intranets.

It’s available in print and PDF e-book formats. You can buy it from Packt’s website, Amazon and others. The sample chapter ‘Using Content Type Effectively’ is also available from Packt.

I have considered myself part of the Plone community since I attended my first Plone Conference in Seattle (2006). I’ve always wanted to contribute to the Plone community one way or another, but I never found the right project or the time for a worthy contribution. I’ve always been attracted by the idea of writing a book, and when the opportunity to write a Plone book came up, I saw clearly that this was the right project to put my efforts into. However, I soon realized that writing a book is no joke. It involves a lot of work, deadlines to meet, endless working weekends, revisions, revisions, and more revisions… but the final result was worth the price. I want to thank all the people who helped me during this year to give birth to this book.

Packt offered me the chance to write a book about Plone-based intranets, aimed at beginners and users with no previous Plone experience who want to learn how to design, build, and deploy a reliable, full-featured, and secure intranet easily from scratch. It sounded very familiar to me because I have a lot of experience building intranets and collaboration sites for the university. I’ve tried to pour into it all the knowledge and experience I’ve acquired during the last 4 years of working actively with Plone.

I hope I’ve succeeded in this effort and that you enjoy this book as much as I’ve enjoyed writing it.

These are some of the key topics covered by the book:

  • Get to grips with installing Plone and all its dependencies
  • Easily set up your Plone site and optimize it to work as an intranet
  • Manage users and groups, and use local and global roles to manage content access
  • Create and modify Plone workflows and learn how to use them
  • Explore the most common security use cases in an intranet and learn how to deal with them in Plone
  • Make effective use of content type and some of the out-of-the-box Plone features to work for your intranet
  • Enhance your intranet with useful add-on products, like corporate blogs, message boards, document preview helpers, and so on
  • Give a fresh, standing-out look to your intranet with attractive themes
  • Deploy your intranet and make your site live

This book is for anyone who needs to build an intranet with no limits on capabilities or features. Even if you don’t have previous CMS experience or programming skills, this book is for you. Targeted at beginners with no previous experience with Plone, this book will teach you step by step and at the end you should have a full-featured, reliable, and secure intranet.
