Want to have a positive impact in the world? Ana Medina has powerful advice to share with all girls who want to know if tech is right for them. Her story started with a simple spark, and this may be YOURS! Be sure to share this motivational video with all your friends and family.
Chaos Engineering lets you compare what you think will happen to what actually happens in your systems. You literally break things on purpose to learn how to build more reliable systems. Lenny Sharpe walks you through Chaos Engineering at Target, covering the tools and practices you need to implement Chaos Engineering with Kubernetes in your organization. Even if you’re already using Chaos Engineering, you’ll learn to identify new ways to use the practice to improve the reliability of your network and services. Ana Medina will share a demonstration of how you can practice Chaos Engineering on Kubernetes and use it to improve the reliability of your systems.
Our panel of experts, all experienced in SRE roles, reflects on how SRE has changed, how to adapt SRE to the unique aspects of each organization, and how software and tools continue to evolve to meet the changing needs of SREs. Mitch Ashley, Techstrong Group, and Austin Parker, Lightstep, host an engaging conversation with our panel of recognized experts while interacting with our live online audience.
How can we advocate for more responsible technology that creates opportunities for all communities to partake in the socio-economic benefits of this generation's innovation? This conversation sits at the intersection of technology, community building, and the ethical decisions we make throughout our careers.
Keptn integrates with most of the great projects in the CNCF landscape (and beyond) that help teams to deliver and operate their cloud native workloads. The integration happens through open event standards (CloudEvents, CDEvents) which is why Keptn makes it easy to connect tools and orchestrate the application lifecycle regardless of your toolchains. What could make Keptn better? Upstreaming and generalizing the best of it!
How does your team prepare for failure and learn from incidents? GameDays are a time to come together as a team and organization to explore failure and learn. This practice has been done across most industries, from fire departments to technology companies.
Ana Margarita Medina, a Staff Developer Advocate at Lightstep, and Darko talk about all things SRE (Site Reliability Engineering) and DevOps. They discuss the finer points of both and the differences between them.
As engineers, we expect our systems and applications to be reliable, and we often test for that at a small scale or in development. But when you scale up and your infrastructure footprint increases, the assumption that conditions will remain stable is wrong. Reliability at scale does not mean eliminating failure; failure is inevitable. How can we get ahead of these failures and ensure we do it in a continuous way?
In this podcast, Ana Medina, senior chaos engineer at Gremlin, sat down with InfoQ podcast co-host Daniel Bryant. Topics discussed included: how enterprise organisations are adopting chaos engineering with the requirements for guardrails and the need for “status checks” to ensure pre-experiment system health; how to run game days or IT fire drills when everyone is working remotely; and why teams should continually invest in learning from past incidents and preparing for inevitable failures within systems.
We might have just spent the last year mastering the art of having the perfect flower field around our home, keeping the weeds out of our island, or just trying to build relationships with our neighbors. The skills and the lessons we’ve mastered through building our islands can also help us in real life, from staying connected to building stronger engineering teams and applications. Let’s take a moment to see the similarities of the work we’ve done on our islands, ourselves, workplaces, and celebrate all we’ve learned.
You may have heard of the buzzwords “chaos engineering” and “containers.” But what do they have to do with each other? In this session, we introduce chaos engineering and share a live demo of how to practice chaos engineering principles on AWS. We walk through chaos engineering practices, tools, and success metrics you can use to inject failures in order to make your systems more reliable.
Chaos engineering is a disciplined approach to identifying failures before they become outages. By proactively testing how a system responds under stress, you can identify and fix failures before they end up in the news. Chaos engineering lets you compare what you think will happen to what actually happens in your systems. Ana shares how you can use this practice before you go multi-cloud, when using multiple clouds, and in other use cases to make your infrastructure and teams more resilient.
How do we prepare to face the loss of our data center? How do we prepare to be on call and available to respond?