DEVOPS USE CASE IN ETSY

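What is DevOps?

     DevOps is a methodology that brings software development ("Dev") and IT operations ("Ops") together, so that the people who build software and the people who run it share processes, tooling, and responsibility for releases.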

About Etsy:

     Etsy is an American e-commerce company focused on the sale of handmade or vintage items and craft supplies. These items fall under a wide range of categories, including jewelry, bags, clothing, home décor, religious items, furniture, toys, and art, as well as craft supplies and tools. Items described as vintage must be at least 20 years old.

Problems faced by Etsy:

  • Deployment Issues:
     Etsy experienced frequent deployment failures and long deployment times. Their deployment process was manual and error-prone, leading to downtime and service disruptions.

  • Slow Release Cycle:
     The traditional release cycle was slow, which hindered their ability to quickly deliver new features and updates to customers. This slow cycle also made it difficult to respond promptly to market changes and customer feedback.

  • Lack of Collaboration:
     There was a lack of collaboration and communication between the development and operations teams. This siloed structure led to misunderstandings and inefficiencies in the development and deployment processes.

  • Limited Automation:
     Many processes were manual, including testing, integration, and deployment. This lack of automation resulted in human errors, longer lead times, and higher operational costs.

Implementing DevOps:

  • Continuous integration and continuous delivery (CI/CD):

     At Etsy, CI means integrating new code into the “master” branch frequently throughout the day. On each merge, the CI system automatically runs a series of tests against the latest changes to confirm that the integration succeeded.
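
The merge gate described above can be sketched in a few lines. This is a minimal illustration, not Etsy's actual CI configuration: the function name `run_checks` and the example commands are hypothetical, and a real setup would hand this work to a CI server such as Jenkins.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each check command in order and stop at the first failure.

    `commands` is a list of argument lists, e.g. [["pytest", "-q"]].
    Returns (all_passed, results), where results holds (command, exit_code)
    for every command that was actually run.
    """
    results = []
    for cmd in commands:
        completed = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, completed.returncode))
        if completed.returncode != 0:
            # A failing check blocks the merge; later checks are skipped.
            return False, results
    return True, results


if __name__ == "__main__":
    # Hypothetical checks: two trivial Python invocations stand in for
    # a real test suite and a linter.
    checks = [
        [sys.executable, "-c", "assert 1 + 1 == 2"],
        [sys.executable, "-c", "print('lint ok')"],
    ]
    passed, _ = run_checks(checks)
    print("safe to merge" if passed else "merge blocked")
```

Stopping at the first failure keeps feedback fast, which matters when merges happen many times a day.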

  • Try:

     Etsy built Try, a library that lets developers test their changes in Jenkins without committing to trunk. This tool is central to Etsy’s continuous integration process: it keeps the trunk clean and deployable while letting developers test their changes quickly and reliably. After Etsy introduced Try in 2011, the number of deploys rose to more than 20 a day and kept growing after that.

  • Deployinator:

     Etsy’s team created Deployinator, a one-button, web-based deployment app, to make code deployment as easy and painless as possible. With Deployinator, a single person can push an update of any size in under two minutes. Before Etsy adopted DevOps, a deployment required at least three developers and one operations engineer, with a production engineer on standby. Deployinator does much of the heavy lifting and sits at the core of the company’s development and deployment model. In 2015, the company announced the re-release of Deployinator as an open-source Ruby gem.

  • Automated testing:

     Continuous deployment allows Etsy to test various scenarios continuously. After investigating several methods for evaluating experiments, including O’Brien-Fleming, Pocock, and sequential testing, Etsy settled on sequential testing. Using the difference in successful observations, the team compared the raw number of successes for the old version with that for the new one. This method worked well for detecting small changes quickly.
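
One widely cited rule of this kind is Evan Miller's simple sequential test: fix a budget of N successes, declare the new version the winner as soon as its successes lead the old version's by 2√N, and stop with no winner once the combined successes reach N. The sketch below implements that rule. Whether Etsy used these exact thresholds is an assumption, and the function name and sample data are hypothetical.

```python
import math


def sequential_test(successes, n):
    """Apply a difference-of-successes stopping rule.

    `successes` yields "T" (treatment, new version) or "C" (control,
    old version), one entry per successful observation, in arrival order.
    `n` is the success budget chosen before the experiment starts.
    Returns (decision, treatment_count, control_count).
    """
    threshold = 2 * math.sqrt(n)
    t = c = 0
    for group in successes:
        if group == "T":
            t += 1
        elif group == "C":
            c += 1
        else:
            raise ValueError(f"unknown group: {group!r}")
        if t - c >= threshold:
            return "treatment wins", t, c
        if t + c >= n:
            return "no winner", t, c
    # Data ran out before either stopping condition was met.
    return "continue", t, c


if __name__ == "__main__":
    # Invented stream of successes, for illustration only.
    stream = ["T", "T", "C", "T"] * 20
    print(sequential_test(stream, n=100))
```

Because the rule is checked after every success, the experiment can end as soon as the evidence is strong enough rather than waiting for a fixed sample size.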

  • Continuous monitoring:

     Etsy spends a lot of time gathering metrics for all its processes. The development team ran at least 14,000 tests per day, and tracking each deployment let them quickly detect bugs they might otherwise have missed.

     Monitoring is how Etsy’s team builds confidence in its CI/CD processes. The company uses monitoring tools such as Nagios, StatsD, Graphite, and Ganglia to correlate issues that arise across its architecture. For instance, in 2009 Etsy started using Graphite to monitor application-level metrics such as new registrations, items sold, images uploaded, shopping carts, forum posts, and application errors.
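
To show how lightweight application-level metrics of this kind can be, here is a minimal sketch of a StatsD client. StatsD accepts plain-text datagrams of the form `name:value|type` over UDP, by default on port 8125. The metric names and the assumption of a StatsD daemon listening on localhost are illustrative and are not taken from Etsy's setup.

```python
import socket


def format_metric(name, value, metric_type="c"):
    """Build a StatsD datagram: "c" = counter, "ms" = timer, "g" = gauge."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Send one metric over UDP. Fire-and-forget: no reply is expected."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical metric names mirroring the examples in the text.
    send_metric("registrations.new", 1)            # count one registration
    send_metric("images.upload_time", 320, "ms")   # record a timing
```

Because UDP is connectionless, instrumenting a code path this way adds very little overhead, which makes it practical to measure almost everything.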

  • Communication:

     Etsy uses IRC (Internet Relay Chat), a flexible communication medium, for various collaborative tasks. For example, there is a channel that organizes the programmers who want to deploy at any given time. This “push” channel holds a queue so that changes are deployed in groups. The first person in each deploy group is responsible for deploying the code to Princess, Etsy’s staging environment.
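
The push queue can be pictured as a queue of small groups in which the first member of the front group leads the deploy. The sketch below models only that idea. The class name, the fixed group size, and the rule for forming groups are assumptions made for illustration, not a description of Etsy's IRC tooling.

```python
from collections import deque


class PushQueue:
    """A queue of deploy groups; the first member of each group leads it."""

    def __init__(self, group_size=3):
        if group_size < 1:
            raise ValueError("group_size must be at least 1")
        self.group_size = group_size
        self.groups = deque()

    def join(self, nick):
        """Add a developer to the last group, opening a new one if it is full.

        Simplification: a developer may join the front group while it is
        still forming; a real queue would likely lock a group once its
        deploy starts.
        """
        if not self.groups or len(self.groups[-1]) >= self.group_size:
            self.groups.append([])
        self.groups[-1].append(nick)

    def current_leader(self):
        """Return who is responsible for the next deploy, or None if idle."""
        return self.groups[0][0] if self.groups else None

    def finish_current_group(self):
        """Remove and return the group whose deploy has completed."""
        return self.groups.popleft() if self.groups else []


if __name__ == "__main__":
    queue = PushQueue(group_size=2)
    for nick in ["alice", "bob", "carol"]:   # hypothetical IRC nicks
        queue.join(nick)
    print(queue.current_leader())
```

Batching changes into groups limits how many people deploy at once while still keeping each deploy small enough to debug.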


Results of implementing DevOps at Etsy:

  • Faster Recovery Times:

     When failures occur, DevOps practices such as automated rollbacks and continuous monitoring help Etsy identify and resolve issues quickly, reducing the mean time to recovery (MTTR).

  • Enhanced System Reliability:

     With DevOps, Etsy can use automated testing, monitoring, and infrastructure as code (IaC). These practices keep systems reliable and scalable, so they can handle increased traffic without downtime, which improves the user experience.

  • Increased Deployment Frequency:

     DevOps practices enable continuous integration and continuous delivery (CI/CD), allowing Etsy to deploy code multiple times a day. This lets Etsy release new features, fix bugs, and improve performance quickly.

  • Enhanced Collaboration:

     DevOps fostered better collaboration between development and operations teams. Shared responsibility for the system’s health and performance broke down silos, leading to improved communication and cooperation.

Before and After using DevOps:


  • Deployment Frequency:
     Before DevOps: Infrequent, larger deployments (possibly weekly or monthly).
     After DevOps: Multiple deployments per day, increasing agility and responsiveness.

  • Deployment Failure Rate:
     Before DevOps: Higher failure rates due to manual processes and lack of automation.
     After DevOps: Significantly reduced failure rates through automated testing, continuous integration (CI), and continuous deployment (CD) practices.

  • Mean Time to Recovery (MTTR):
     Before DevOps: Longer recovery times due to slower troubleshooting and rollback processes.
     After DevOps: Faster recovery times with improved monitoring, automated rollbacks, and better incident response strategies.

  • Scalability:
     Before DevOps: Struggled with scaling infrastructure efficiently.
     After DevOps: Enhanced scalability with automated infrastructure management and better resource utilization.

  • Customer Satisfaction:
     Before DevOps: Potentially lower satisfaction due to slower updates and frequent service interruptions.
     After DevOps: Higher customer satisfaction due to faster delivery of new features, more stable services, and quicker issue resolution.

  • Lead Time for Changes:
     Before DevOps: Long lead times from code commit to deployment (often weeks or months).
     After DevOps: Drastically reduced lead times (sometimes down to hours), enabling rapid feature delivery and bug fixes.

  • System Reliability and Uptime:
     Before DevOps: More frequent downtimes and performance issues.
     After DevOps: Improved uptime and system reliability due to proactive monitoring and quicker issue resolution.
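
Two of the metrics compared above, deployment frequency and MTTR, are simple to compute once deploys and incidents are recorded with timestamps. The sketch below shows one way to do it. The function names and the sample data are invented for illustration and are not Etsy's figures.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (resolved - detected) over (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [resolved - detected for detected, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


def deploys_per_day(deploy_times):
    """Average deploys per calendar day, counting only days with a deploy."""
    if not deploy_times:
        raise ValueError("at least one deploy is required")
    per_day = {}
    for ts in deploy_times:
        per_day[ts.date()] = per_day.get(ts.date(), 0) + 1
    return sum(per_day.values()) / len(per_day)


if __name__ == "__main__":
    # Invented sample data.
    incidents = [
        (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),
        (datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 12, 10)),
    ]
    deploys = [
        datetime(2024, 1, 1, 9, 0),
        datetime(2024, 1, 1, 11, 0),
        datetime(2024, 1, 1, 15, 0),
        datetime(2024, 1, 2, 10, 0),
    ]
    print(mean_time_to_recovery(incidents))
    print(deploys_per_day(deploys))
```

Tracking these numbers over time is what turns a before-and-after comparison like the one above from a qualitative impression into something measurable.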
