Choosing between baked and fried provisioning

Eggs to be baked or fried, like provisioning

Provisioning always requires resources from somewhere: packages in remote repositories and compressed files from Internet addresses, in all sizes and formats. Depending on where they are and on the available bandwidth, downloading them can take longer than expected. If provisioning is a repetitive task, as in automated tests, you might want to use baked images in order to save time.

Baked images

Baked images are prepared in advance, with software and configuration already in place. For this reason, they are usually bigger than the images used in fried provisioning. If you maintain a repository of baked images, storage becomes a real consideration, mainly if the images are versioned. Downloading and uploading baked images also has its cost, so it’s better to minimize those transfers as much as possible.

Like baked eggs, baked images are ready to be consumed; there’s no need to add anything else. They do require some effort in advance, but it pays off if you need a virtual machine right away.

Baked images also favor the use of immutable servers, because most of the time they don’t require extra intervention after instantiation. In addition, if something goes wrong with an instance, it’s better to recreate it than to repair it. That makes baked images a good fit for autoscaling, since they are quickly instantiated and ready to use.

Fried provisioning

On the other hand, fried provisioning is based on raw images, usually with just the operating system installed. Once instantiated, these lightweight images must be provisioned with all the required software and configuration in order to reach the ready-to-use state. As with fried eggs, you must follow the recipe and combine all the ingredients until they are ready to be consumed.

One concern about fried provisioning, when it is executed repeatedly, is to avoid breaking it. During the process, a package manager such as apt is usually used to install the required software. Unless you specify which version the package manager must install, it will install the latest one available. Untested new versions can behave unexpectedly and can even break the provisioning process. For that reason, always be specific about which version must be installed.
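As a minimal sketch of that advice, the Ruby helper below builds an apt-get command in which every package carries an explicit version, using apt's name=version syntax, and refuses to build one if any version is missing. The helper name pinned_install_command and the package versions are illustrative assumptions, not part of the Codeyourinfra project.

```ruby
# Hypothetical helper (not from the Codeyourinfra project): builds an
# apt-get command in which every package is pinned to an explicit version.
def pinned_install_command(packages)
  # Collect the packages that were given without a version.
  unpinned = packages.select { |_name, version| version.nil? || version.strip.empty? }.keys
  unless unpinned.empty?
    raise ArgumentError, "missing version for: #{unpinned.join(', ')}"
  end

  # apt accepts the name=version form to install one specific version.
  specs = packages.map { |name, version| "#{name}=#{version}" }
  "apt-get install -y #{specs.join(' ')}"
end

# Illustrative package versions only; check what your repositories offer.
puts pinned_install_command('nginx' => '1.4.6-1ubuntu3', 'curl' => '7.35.0-1ubuntu2')
```

Failing early on an unpinned package keeps the surprise out of the provisioning run itself, where it would cost a full download cycle to discover.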

Codeyourinfra provisioning options

Since version 1.4.0 of the Codeyourinfra project on GitHub, the development environment can be initialized with either provisioning option: fried, the default, or baked. It means that the original base image, a minimized version of a Vagrant box with Ubuntu 14.04.3 LTS, can now be replaced by a baked one. The baked images are available on Vagrant Cloud and can be downloaded not only by those who want to use Codeyourinfra’s development environment, but also by anyone who wants a ready-to-use image.

Choosing one provisioning option or the other is quite simple. If you want to use the baked image, set the environment variable PROVISIONING_OPTION to baked. Otherwise, leave it unset, because fried is the default, or set it explicitly to fried.
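To illustrate how a Vagrantfile might act on that variable, here is a minimal sketch. It is not the project's actual code: the function names and the baked box name are assumptions, and only minimal/trusty64 comes from this article.

```ruby
# Sketch only: maps each provisioning option to a Vagrant box name.
# 'minimal/trusty64' is the base box mentioned in this article;
# the baked box name is a hypothetical placeholder.
BOXES = {
  'fried' => 'minimal/trusty64',
  'baked' => 'example/baked-box'
}.freeze

# Reads PROVISIONING_OPTION, falling back to 'fried' when it is unset.
def provisioning_option(env = ENV)
  option = env['PROVISIONING_OPTION'] || 'fried'
  unless BOXES.key?(option)
    raise ArgumentError, "PROVISIONING_OPTION must be 'fried' or 'baked', got '#{option}'"
  end
  option
end

# Returns the box to be used for the selected option.
def box_for(env = ENV)
  BOXES.fetch(provisioning_option(env))
end
```

With something like this in place, running PROVISIONING_OPTION=baked vagrant up would select the baked box, while a plain vagrant up would keep the fried default.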

Baking the image

The process of building the baked images was simple. I could have used a tool like Packer to automate it, but I followed these steps manually:

1. vagrant up <machine>, where <machine> is the name of the VM defined in the Vagrantfile. The VM is then initialized from the minimal/trusty64 Vagrant box and provisioned by Ansible.

2. vagrant ssh <machine>, in order to connect to the VM through SSH. The user is vagrant. At this point the VM is ready to use, which makes it the perfect moment to take a snapshot of the image. Before that, in order to get a smaller image, it’s recommended to free up disk space:

sudo apt-get clean
sudo dd if=/dev/zero of=/EMPTY bs=1M
sudo rm -f /EMPTY
cat /dev/null > ~/.bash_history && history -c && exit

3. vagrant package <machine> --output <file>, where <file> is the name of the box file to create, to finally build the baked image file, which was then uploaded to Vagrant Cloud.

The initialization duration

By default, Vagrant does not show a timestamp next to each step in the command’s output. Hence you cannot easily know how long the environment initialization takes. To overcome this limitation, another environment variable was introduced: APPEND_TIMESTAMP. If it is set to true, the current datetime is prepended to every output line, so you can measure the initialization duration.

Each Vagrantfile, when executed, now loads right at the beginning the Ruby code below, which overrides the default Vagrant output behavior if the APPEND_TIMESTAMP flag is turned on. Actually, Vagrant already has an issue on GitHub requesting this enhancement, where this code was presented as a workaround.

append_timestamp = ENV['APPEND_TIMESTAMP'] || 'false'
if append_timestamp != 'true' && append_timestamp != 'false'
  abort "APPEND_TIMESTAMP must be 'true' or 'false'."
end
if append_timestamp == 'true'
  def $stdout.write string # prefix each non-blank chunk with the datetime
    log_datas = string
    if log_datas.gsub(/\r?\n/, "") != ''
      log_datas = ::Time.now.strftime("%d/%m/%Y %T") + " " + log_datas.gsub(/\r\n/, "\n")
    end
    super log_datas
  end
  def $stderr.write string # same override for the error stream
    log_datas = string
    if log_datas.gsub(/\r?\n/, "") != ''
      log_datas = ::Time.now.strftime("%d/%m/%Y %T") + " " + log_datas.gsub(/\r\n/, "\n")
    end
    super log_datas
  end
end
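
Once the output carries timestamps, the duration can be worked out from the first and the last stamped lines. The sketch below assumes the output was captured as a list of lines in the %d/%m/%Y %T format used above; the function names and the sample lines are illustrative, not part of the project.

```ruby
# Matches the "%d/%m/%Y %T" prefix that the override above prepends.
TIMESTAMP = %r{\A(\d{2})/(\d{2})/(\d{4}) (\d{2}):(\d{2}):(\d{2}) }

# Returns the Time at the start of a line, or nil if the line has no stamp.
def parse_timestamp(line)
  match = TIMESTAMP.match(line)
  return nil unless match

  day, month, year, hour, min, sec = match.captures.map(&:to_i)
  Time.utc(year, month, day, hour, min, sec)
end

# Seconds elapsed between the first and the last stamped lines.
def initialization_duration(lines)
  stamps = lines.map { |line| parse_timestamp(line) }.compact
  return nil if stamps.size < 2

  stamps.last - stamps.first
end

# Illustrative output lines, not a real capture.
sample = [
  "15/01/2018 10:00:00 Bringing machine 'default' up with 'virtualbox' provider...",
  "",
  "15/01/2018 10:03:25 ==> default: Machine booted and ready!"
]
puts initialization_duration(sample) # => 205.0
```

Comparing this number for a fried and a baked run gives a rough measure of how much time the baked image saves.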

Feel free to experiment with the provisioning options, along with the timestamp flag set to true! You now have a better environment to try the Codeyourinfra project solutions.

And don’t forget to tell me your problem! For sure we can find a solution together 🙂


