000867256 000__ 03277cam\a2200517Ii\4500
000867256 001__ 867256
000867256 005__ 20210515163054.0
000867256 006__ m\\\\\o\\d\\\\\\\\
000867256 007__ cr\cn\nnnunnun
000867256 008__ 190521s2016\\\\caua\\\\ob\\\\001\0\eng\d
000867256 019__ $$a945736622
000867256 020__ $$a9781491951187$$q(electronic book)
000867256 020__ $$a1491951184$$q(electronic book)
000867256 020__ $$a9781491951170$$q(electronic book)
000867256 020__ $$a1491951176$$q(electronic book)
000867256 020__ $$z9781491929124
000867256 035__ $$a(OCoLC)ocn945577030
000867256 035__ $$a(OCoLC)945577030$$z(OCoLC)945736622
000867256 035__ $$a867256
000867256 040__ $$aN$T$$beng$$erda$$epn$$cN$T$$dYDXCP$$dN$T$$dIDEBK$$dTEFOD$$dUMI$$dOCLCF$$dOCC$$dCDX$$dKSU$$dDEBSZ$$dDEBBG$$dEBLCP$$dOCLCQ$$dCOO$$dHCO$$dUOK$$dCEF$$dNTG$$dWYU$$dC6I$$dUAB$$dUKAHL
000867256 049__ $$aISEA
000867256 050_4 $$aHD9696.8.U64$$bG6666 2016eb
000867256 08204 $$a620.00452$$223
000867256 24500 $$aSite reliability engineering :$$bHow Google runs production systems /$$cedited by Betsy Beyer ... and others.
000867256 264_1 $$aSebastopol, CA :$$bO'Reilly Media,$$c2016.
000867256 300__ $$a1 online resource (xxiv, 524 pages) :$$billustrations
000867256 336__ $$atext$$btxt$$2rdacontent
000867256 337__ $$acomputer$$bc$$2rdamedia
000867256 338__ $$aonline resource$$bcr$$2rdacarrier
000867256 504__ $$aIncludes bibliographical references and index.
000867256 5050_ $$aIntroduction. The production environment at Google, from the viewpoint of an SRE -- Principles. Embracing risk -- Service level objectives -- Eliminating toil -- Monitoring distributed systems -- The evolution of automation at Google -- Release engineering -- Simplicity -- Practices. Practical alerting from time-series data -- Being on-call -- Effective troubleshooting -- Emergency response -- Managing incidents -- Postmortem culture: learning from failure -- Tracking outages -- Testing for reliability -- Software engineering in SRE -- Load balancing at the frontend -- Load balancing in the datacenter -- Handling overload -- Addressing cascading failures -- Managing critical state: distributed consensus for reliability -- Distributed periodic scheduling with Cron -- Data processing pipelines -- Data integrity: what you read is what you wrote -- Reliable product launches at scale -- Management. Accelerating SREs to on-call and beyond -- Dealing with interrupts -- Embedding an SRE to recover from operational overload -- Communication and collaboration in SRE -- The evolving SRE engagement model -- Conclusions. Lessons learned from other industries.
000867256 506__ $$aAccess limited to authorized users.
000867256 588__ $$aDescription based on print version record.
000867256 61020 $$aGoogle (Firm)
000867256 650_0 $$aReliability (Engineering)
000867256 650_0 $$aSystems engineering$$xManagement.
000867256 650_0 $$aInternet industry$$zUnited States$$xManagement.
000867256 7001_ $$aBeyer, Betsy,$$eeditor.
000867256 7001_ $$aJones, Chris$$c(Computer engineer),$$eeditor.
000867256 7001_ $$aPetoff, Jennifer,$$eeditor.
000867256 7001_ $$aMurphy, Niall Richard,$$eeditor.
000867256 77608 $$iPrint version:$$aSite reliability engineering.$$dSebastopol, CA : O'Reilly, 2016$$z9781491929124$$w(OCoLC)930683030
000867256 852__ $$bcoll
000867256 85280 $$bebk$$hEBSCOhost
000867256 85640 $$3eBooks on EBSCOhost$$uhttps://univsouthin.idm.oclc.org/login?url=http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1204854$$zOnline Access
000867256 909CO $$ooai:library.usi.edu:867256$$pGLOBAL_SET
000867256 980__ $$aEBOOK
000867256 980__ $$aBIB
000867256 982__ $$aEbook
000867256 983__ $$aOnline
000867256 994__ $$a92$$bISE