«I have stretched ropes from steeple to steeple; garlands from window to window; golden chains from star to star, and I dance.» -- Arthur Rimbaud

/Soundtrack: ``Gaznevada - Going Underground''/

``Why Does the Cloud Stop Computing? - Lessons from Hundreds of Service Outages'', Haryadi S. Gunawi et al.
-----------------------------------------------------------------------------------------------------------

«We conducted a cloud outage study (COS) of 32 popular Internet services. We analyzed 1247 headline news and public post-mortem reports that detail 597 unplanned outages that occurred within a 7-year span from 2009 to 2015. We analyzed outage duration, root causes, impacts, and fix procedures. This study reveals the broader availability landscape of modern cloud services and provides answers to why outages still take place even with pervasive redundancies.»

A very thorough, well-written and highly readable paper that analyzes the `outages' of various "cloud" services both quantitatively and qualitatively. A possible (and personal) reading follows... but beware! I recommend reading the paper first and only then - if you like - reading below (in other words, spoilers ahead!).

After a quick introduction to cloud computing and its relevance in recent years, the paper moves on to `outages' and the damage they cause, and then formulates the questions that the rest of the paper will answer:

«How often and how long do outages typically happen and last across a wide range of Internet services? How many services do not reach 99% (or 99.9%) availability? Do outages happen more in mature or young services? What are the common root causes that plague a wide range of service deployments?
What are the common lessons that can be gained from various outages?»

The paper then introduces the methodology used to build the Cloud Outage Study database (CosDB): 1247 `headline news and public post-mortem reports' on 597 `unplanned outages' that occurred over a 7-year span (2009-2015) were analyzed, and the various "outage metadata" were extracted manually to populate the CosDB.

...but shouldn't redundancies, or more generally the No-SPOF (no single point of failure) principle, prevent `outages'?:

«This broad study also raises a perplexing question: even with pervasive redundancies, why do outages still take place? That is, as the principle of no single point of failure (No-SPoF) via redundancies has been preached extensively, redundant components are deployed pervasively in many levels of hardware and software stack. Yet, outages are still inevitable. Is there another "hidden" single point of failure? Studying hundreds of outages in tens of services reveal a common thread. We find that the No-SPOF principle is not merely about redundancies, but also about the perfection of failure recovery chain: complete failure detection, flawless failover code, and working backup components. Although this recovery chain sounds straightforward, we observe numerous outages caused by an imperfection in one of the steps. We find cases of missing or incorrect failure detection that do not activate failover mechanisms, buggy failover code that cannot transfer control to backup systems, and cascading bugs and coincidental multiple failures that cause backup systems to also fail.»

The 2nd section describes in detail the methodology used to collect the data and how the metadata are defined (i.e. `Root causes', `Impacts', `Fixes', `Downtime', `Type' and `Scope', with their respective (sub-)classification tags).
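As a back-of-the-envelope aid to the availability thresholds the paper asks about (99% and 99.9%), here is a quick calculation of the downtime budget each target implies; an illustrative helper of mine, not something from the paper:

```python
# Downtime budget implied by an availability target, per year.
# Illustrative sketch (not from the paper).

HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours_per_year(availability: float) -> float:
    """Hours of downtime per year allowed at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR

# 99%  ("two nines")   allows 87.60 h/year, i.e. ~3.65 days of outage
# 99.9% ("three nines") allows  8.76 h/year
for a in (0.99, 0.999):
    print(f"{a:.1%}: {downtime_hours_per_year(a):.2f} h/year")
```

Seen this way, a service that misses even the 99% mark has been down for the equivalent of several full days in a year.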
The 3rd section analyzes service availability quantitatively, also examining whether the maturity of a service has any bearing on it. The 4th section is devoted to locating the SPOFs (single points of failure):

«We find that the No-SPOF principle is not merely about hardware redundancies, but also requires the perfection of failure recovery chain: complete failure/anomaly detection, flawless failover code, and working backup components. Each of these elements ideally must be flawless. Yet, many of the outages we study are rooted by some flaws within this chain as we elaborate below.»

The 5th section discusses the `Root Causes' both quantitatively and qualitatively. For reasons of space I am not quoting the highlights, but there are numerous interesting observations. The 6th section describes the impacts and the procedures used to fix the problem (when available!). The 7th section reflects on the advantages and disadvantages of the methodology used, which the 8th section then compares with the `Related Work'. The 9th section concludes with:

«A big challenge lies ahead: features and failures are racing with each other. As users are hungry for new advanced features, services are developed in a much rapid pace compared to the past. As a ramification, the complexity of cloud hardware and software ecosystem has outpaced existing testing, debugging, and verification tools. We hope our study can be valuable to cloud developers, operators, and users.»

``Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures'', Roman V. Yampolskiy et al.
-----------------------------------------------------------------------------------------------------------

«In this work, we present and analyze reported failures of artificially intelligent systems and extrapolate our analysis to future AIs. We suggest that both the frequency and the seriousness of future AI failures will steadily increase.
AI Safety can be improved based on ideas developed by cybersecurity experts. For narrow AIs safety failures are at the same, moderate, level of criticality as in cybersecurity, however for general AI, failures have a fundamentally different impact. A single failure of a superintelligent system may cause a catastrophic event without a chance for recovery. The goal of cybersecurity is to reduce the number of successful attacks on the system; the goal of AI Safety is to make sure zero attacks succeed in bypassing the safety mechanisms. Unfortunately, such a level of performance is unachievable. Every security system will eventually fail; there is no such thing as a 100% secure system.»

A very interesting paper on `AI Safety' that IMHO contains several thought-provoking reflections. I include a reading of my own here, but if the topic interests you I recommend reading the paper first! (in other words, if you keep reading here you risk running into spoilers!)

After an overview of recent developments in AI and some predictions on the subject by Ray Kurzweil (which may be debatable!), the possible damage caused by a malevolent AI is analyzed. A first interesting reflection concerns the scant attention paid to the study and analysis of malevolent AIs, compared with what has been happening for decades in computer security:

«The authors observe that cybersecurity research involves publishing papers about malicious exploits as much as publishing information on how to design tools to protect cyber-infrastructure. It is this information exchange between hackers and security experts that results in a well-balanced cyber-ecosystem.
In the domain of AI Safety Engineering, hundreds of papers [3] have been published on different proposals geared at the creation of a safe machine, yet nothing else has been published on how to design a malevolent machine.»

The paper continues with a chronological list of several incidents in which AI (or rather NAI, `Narrow Artificial Intelligence') was not very `I'. It then recalls the concept of `Artificial Intelligence Safety Engineering' before moving on to the `Fundamental Theorem of Security', paraphrasing some reflections by Bruce Schneier and Salman Rushdie:

«Bruce Schneier has said, "If you think technology can solve your security problems then you don't understand the problems and you don't understand the technology". Salman Rushdie made a more general statement: "There is no such thing as perfect security, only varying levels of insecurity" We propose what we call the Fundamental Theorem of Security - Every security system will eventually fail; there is no such thing as a 100% secure system. If your security system has not failed, just wait longer.»

The conclusions contain further interesting reflections.

«Fully autonomous machines can never be assumed to be safe. The difficulty of the problem is not that one particular step on the road to friendly AI is hard and once we solve it we are done. All of the steps on the path are simply impossible. First, human values are inconsistent and dynamic and so can not be understood and subsequently programmed into a machine.»

The paper then reflects on AGI (`Artificial General Intelligence') and on a possible `digital upload' of minds, which should no longer be considered human:

«We are as concerned about digital uploads of human minds as about AIs. In the most common case (with an absent body), most typical human feelings (hungry, thirsty, tired etc.) will not be preserved, creating a new type of agent. People are mostly defined by their physiological needs (Maslow's Hierarchy of Needs).
An entity with no such needs (or with such needs satisfied by virtual/simulated abundant resources), will not be human and will not want the same things as a human. Someone who is no longer subject to human weaknesses or relatively limited intelligence may lose all allegiances to humanity since they would no longer be a part of it. Consequently, we define "humanity" as comprised of standard/unaltered humans. Anything superior is no longer a human, just like we are no longer Homo Erectus, but Homo Sapiens.»

It closes with the analogy between religious scriptures and recent research on AI safety and ethics:

«God, the original designer of biological robots, faced a similar Control Problem with people, and one can find remarkable parallels between concepts described in religious books and the latest research in AI safety and machine morals. For example: 10 commandments ≈ 3 laws of robots, second coming ≈ singularity, physical worlds ≈ AI-Box, free will ≈ non-deterministic algorithm, angels ≈ friendly AI, religion ≈ machine ethics, purpose of life ≈ terminal goals, souls ≈ uploads, etc. However, it is not obvious if god ≈ superintelligence or if god ≈ programmer in this metaphor. Depending on how we answer this question the problem may be even harder compared to what theologians had to deal with for millennia. The real problem might be "how do you control God?" And the answer might be - "we can't".»

``Ransomware Took San Francisco's Public Transit for a Ride'', Jamie Condliffe
-----------------------------------------------------------------------------------------------------

TLDR: «Hackers forced the light rail network to let passengers ride free to avoid a massive disruption to service. The San Francisco Municipal Transportation Agency was taken for a ride of its own when hackers used ransomware to shut down its ticketing systems and demand payment.
The agency - usually known as Muni - found that around 2,000 of its servers and computers, including many ticket machines, were locked by ransomware over the Thanksgiving weekend.»

``Functional Programming & Haskell'', Computerphile
---------------------------------------------------

An interview with John Hughes!

``IoT Botnets Are Growing - and Up for Hire'', Jamie Condliffe
------------------------------------------------------------------------------------

BaaS: botnet as a service?

``Unikernels: The Rise of the Virtual Library Operating System'', Anil Madhavapeddy, David J. Scott
---------------------------------------------------------------------------------------------------

A look at unikernels, MirageOS in particular.

``Awk - A Pattern Scanning and Processing Language'', Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger
------------------------------------------------------------------------------------------------------------

«Awk is a programming language whose basic operation is to search a set of files for patterns, and to perform specified actions upon lines or fields of lines which contain instances of those patterns. [...] This report contains a user's guide, a discussion of the design and implementation of awk, and some timing statistics.»
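The pattern/action model the abstract describes can be sketched in a few lines of Python; this is my own toy illustration of the idea (regex pattern paired with an action on whitespace-split fields, echoing Awk's $1, $2, ...), not Awk itself:

```python
import re

def run_rules(lines, rules):
    """Apply (pattern, action) rules to each line, Awk-style:
    for every line matching a rule's regex, run that rule's action
    on the line and its whitespace-separated fields."""
    results = []
    for line in lines:
        fields = line.split()  # Awk's default field splitting
        for pattern, action in rules:
            if re.search(pattern, line):
                results.append(action(line, fields))
    return results

# Example input: "name value" records.
text = ["alice 42", "bob 7", "carol 99"]
# One rule: on lines ending in a number, keep the first field
# when the second field exceeds 10 (roughly: $2 > 10 { print $1 }).
rules = [(r"\d+$", lambda line, f: f[0] if int(f[1]) > 10 else None)]
print([r for r in run_rules(text, rules) if r is not None])
# prints ['alice', 'carol']
```

The equivalent one-liner in awk itself would be something like `awk '$2 > 10 { print $1 }'`, which is exactly the terseness the paper's design aims for.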