
2023-06-17

Unbricking a Kobo Forma eReader

While the Kobo eReaders have provided pleasure to many people, the devices have some design issues. The user guides and various internet fora give a suspicious amount of attention to how to do a soft reset, hard reset, factory reset, paperclip-hole reset, etc.

The Kobo Forma device that was presented to me got stuck in a power-on loop in a low-battery state. As a result, the battery could no longer be charged and the device could not be properly powered off. All of the reset methods mentioned above failed, and the Kobo Forma does not have a paperclip reset hole. The device was essentially bricked and too old to send to Kobo for repair.

So it was time for the last resort: opening the device. WARNING: opening the device will damage the seals that make the device water-tight. The device can be opened by simply bending one of the edges of the rubber backside with slight manual force (similar to removing a protective bumper from a smartphone). By chance, I did this in a room with a temperature of about 28 degrees Celsius. In a colder room, you may want to preheat the device to about this temperature to make bending the rubber and loosening the water-tight glue easier. After this, the device looks as follows:


The power leads from the Li-ion battery are directly soldered to the PCB. Because I do not have the equipment to handle indium-based solder with a high melting point, I decided to simply cut one of the power leads and carry out the following steps:

  • connect the device to the charger so that the power-on cycle can be completed (I forgot this the first time, as can be seen in the photograph)
  • remove some insulation from both ends of the cut wire
  • tin the clean ends with some fresh solder
  • bend the freshly soldered ends so that they run in parallel just a few millimeters apart (but do not let them touch each other yet)
  • heat one of the ends so that the solder melts
  • quickly push, with your finger or some tool, this part of the wire against the other part of the wire, while keeping contact between the soldering iron and the two clean ends until all solder is molten. This step must be done fast and precisely, so that the solder does not oxidize from being heated too long and the power-on cycle of the device is not interrupted by breaking the contact between the wire ends during soldering.
  • insulate the soldered joint with some tape to prevent a future short circuit between the power leads (I suppose the Li-ion battery has some protection to prevent a fire during a short circuit, but I cannot guarantee this).

In my case, this resulted in a reset device that reported just a 2% battery charge. Good luck with yours!


2020-12-31

Python-friendly dtypes for pyspark dataframes

When using pyspark, most of the JVM core of Apache Spark is hidden from the python user. A notable exception is the DataFrame.dtypes attribute, which contains JVM format string representations of the data types of the DataFrame columns. While for the atomic data types the translation to python data types is trivial, for the composite data types the string representations can quickly become unwieldy (e.g. when using the elasticsearch-hadoop InputFormat).

from pyspark.sql import SparkSession, functions as f, types as t, Row

spark = SparkSession.builder.getOrCreate()

# Create DataFrame with atomic and composite column types
ints = [1, 2, 3]
arrays = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
maps = [{'a': 1}, {'b': 2}, {'c': 3}]
rows = [Row(x=1, y='1'), Row(x=2, y='2'), Row(x=3, y='3')]
composites = [Row(x=1, y=[1, 2], z={'a': 1, 'b': 2}), Row(x=2, y=[3, 4], z={'a': 3, 'b': 4})]
df = spark.createDataFrame(zip(ints, arrays, maps, rows, composites))

# Show standard dtypes
for x in df.dtypes:
    print(x)

# ('_1', 'bigint')
# ('_2', 'array<bigint>')
# ('_3', 'map<string,bigint>')
# ('_4', 'struct<x:bigint,y:string>')
# ('_5', 'struct<x:bigint,y:array<bigint>,z:map<string,bigint>>')

# Show python types in collected DataFrame
row = df.collect()[0]
for x in row:
    print(type(x))

# <class 'int'>
# <class 'list'>
# <class 'dict'>
# <class 'pyspark.sql.types.Row'>
# <class 'pyspark.sql.types.Row'>

# Show python types passed to a user defined function
def python_type(x):
    return str(type(x))

udf_python_type = f.udf(python_type, t.StringType())
row = df \
    .withColumn('_1', udf_python_type('_1')) \
    .withColumn('_2', udf_python_type('_2')) \
    .withColumn('_3', udf_python_type('_3')) \
    .withColumn('_4', udf_python_type('_4')) \
    .withColumn('_5', udf_python_type('_5')) \
    .collect()[0]
for x in row:
    print(x)

# <class 'int'>
# <class 'list'>
# <class 'dict'>
# <class 'pyspark.sql.types.Row'>
# <class 'pyspark.sql.types.Row'>

While the dtypes attribute shows the data types in terms of the JVM StructType, ArrayType and MapType classes, the python programmer gets to see the corresponding python types when collecting the DataFrame or passing a column to a user defined function.

To fill this gap in type representations, this blog presents a small utility that translates the content of the dtypes attribute to a data structure with string representations of the corresponding python types. The utility can be found as a gist on github, but is also listed below:

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#
import re
import string

import pyparsing


def pysql_dtypes(dtypes):
    """Represents the spark-sql dtypes in terms of python [], {} and Row()
    constructs.
    :param dtypes: [(string, string)] result from pyspark.sql.DataFrame.dtypes
    :return: [(string, string)]
    """

    def assemble(nested):
        cur = 0
        assembled = ''
        while cur < len(nested):
            parts = re.findall(r'[^:,]+', nested[cur])
            if not parts:
                parts = [nested[cur]]
            tail = parts[-1]
            if tail == 'array':
                assembled += nested[cur][:-5] + '['
                assembled += assemble(nested[cur+1])
                assembled += ']'
                cur += 2
            elif tail == 'map':
                assembled += nested[cur][:-3] + '{'
                assembled += assemble(nested[cur+1])
                assembled += '}'
                cur += 2
            elif tail == 'struct':
                assembled += nested[cur][:-6] + 'Row('
                assembled += assemble(nested[cur+1])
                assembled += ')'
                cur += 2
            else:
                assembled += nested[cur]
                cur += 1
        return assembled

    chars = ''.join([x for x in string.printable if x not in ['<', '>']])
    word = pyparsing.Word(chars)
    parens = pyparsing.nestedExpr('<', '>', content=word)
    dtype = word + pyparsing.Optional(parens)

    result = []
    for name, schema in dtypes:
        tree = dtype.parseString(schema).asList()
        pyschema = assemble(tree).replace(',', ', ').replace(',  ', ', ')
        result.append((name, pyschema))
    return result

The pysql_dtypes() function starts with building a simple grammar using the pyparsing package, to parse the dtypes as given by pyspark. Central in the grammar are the special characters '<' and '>' that are used to recognize nested types in the array<>, map<> and struct<> constructs. Note that these characters cannot occur in JVM or python field names. The pyparsing.nestedExpr() method takes care of any multi-level nesting. Words are defined as arbitrary successions of printable characters with the exception of the angle brackets (because we use the output of DataFrame.dtypes as input, we assume that we will not encounter any weird characters). Finally, a word occurs either at the start of a JVM type representation (in the case of so-called atomic types) or within angle brackets.
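To see what the parser hands over to the assembly step, the snippet below rebuilds the same grammar and prints the parse tree for the struct dtype from the example above; the commented output is what I would expect, although the exact tokenization may differ slightly between pyparsing versions.

import string

import pyparsing

# Rebuild the grammar from pysql_dtypes() and inspect a parse tree
chars = ''.join([x for x in string.printable if x not in ['<', '>']])
word = pyparsing.Word(chars)
parens = pyparsing.nestedExpr('<', '>', content=word)
dtype = word + pyparsing.Optional(parens)

tree = dtype.parseString('struct<x:bigint,y:array<bigint>,z:map<string,bigint>>').asList()
print(tree)
# ['struct', ['x:bigint,y:array', ['bigint'], ',z:map', ['string,bigint']]]
# The words are cut off at the angle brackets and each nested type becomes a
# nested list that assemble() recurses into.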

The assemble() function translates the parsed JVM type representation into the corresponding python types and re-assembles them. This function is recursive because the nested expressions can have arbitrary depth. It splits the earlier defined 'words' into parts separated by a ',' or ':' character and then applies a simple recipe for the assembly of the parts for the various possible combinations of the recognized array, map and struct constructs. The gist on github also contains a test suite that provides many sample outputs of the pysql_dtypes() function. The code sample below takes the example used earlier.

from pysql_dtypes import pysql_dtypes

for x in pysql_dtypes(df.dtypes):
    print(x)

# ('_1', 'bigint')
# ('_2', '[bigint]')
# ('_3', '{string, bigint}')
# ('_4', 'Row(x:bigint, y:string)')
# ('_5', 'Row(x:bigint, y:[bigint], z:{string, bigint})')

The pysql_dtypes() function will be suggested to Apache Spark, so if you would like the utility to become available as part of pyspark, be sure to star the gist on github and add yourself as a watcher to the corresponding issue (requires an Apache Jira account).



2020-06-11

Proposing authorization for Gremlin Server

Introduction

Gremlin Server is an important part of the Apache TinkerPop ecosystem for supporting various ways of access to graph systems. In particular, Gremlin Server offers remote access over websocket connections to client applications based on a Gremlin Language Variant (GLV). In addition, Gremlin Server caters for various security mechanisms such as authentication, read-only access, subgraph access, timeouts and sandboxing.

While this sounds like a comprehensive suite of features, there is a catch: it is not possible to discriminate between authenticated users in providing or limiting access. In other words, Gremlin Server does not provide authorization.

Recent interest in authorization solutions has become apparent in TinkerPop-related user forums.

Some authentication and authorization details

The definitions of authentication (is this the user who he/she claims to be?) and authorization (what is this user allowed to do?) are generally known, but the following characteristics of the underlying processes are not always immediately realized:
  1. authentication is a one-off procedure when logging in to an application, while authorization is required at the start of each new usage context (e.g. a database query or selecting some feature of an application).
  2. authorization is implicit in an application that has a private store of authentication credentials for a limited set of users (like Gremlin Server's SimpleAuthenticator). In this case, authorization has a binary nature: the user either is or is not allowed to run every possible operation inside the application.
  3. managing authorizations for an instance of an application in a dedicated store is time-consuming and error-prone. Scalable authorization solutions manage accessible resources and allowed operations per user in a centralized store (e.g. LDAP, Apache Ranger, Active Directory).

What to authorize

Before zooming in on specific characteristics of Gremlin Server, it is useful to consider the authorization needs of graph applications in general. As with relational databases, access to data can be granted at a number of levels. Because authorization schemes in SQL are more developed than in graph systems (the SQL language standard even has the GRANT/REVOKE keywords), we will pursue the analogy a bit further; see the table below.

  • database level
      SQL RDBMS: GRANT CONNECT ON moviedb TO users;
      graph system: access to a subset of graphs
  • table level
      SQL RDBMS: GRANT SELECT ON titles TO users;
      graph system: access to a subset of vertices and edges, based on their labels
  • column level
      SQL RDBMS: GRANT SELECT ON titles.director TO users;
      graph system: access to a subset of vertex and edge properties, based on their property keys
  • row level
      SQL RDBMS: CREATE POLICY age_policy FOR SELECT ON titles TO child_users USING (age_limit < 18);
      graph system: access to vertices and edges in a graph, based on a predicate on property values

Some further comments on this table:
  • The row level security construct is specific to PostgreSQL, but other RDBMSs have similar constructs.
  • Apart from the CONNECT operation at the database level and the SELECT operation at the table and row level, various other operations exist, e.g. for creation/insertion, deletion, etc.
  • The principle of using ACL-like (meta)properties on an object (vertex, edge, property) for column and row level security is already mentioned in the Apache TinkerPop reference documentation.
  • The fact that RDBMSs can be used to implement graph systems suggests that these authorization types should suffice for a graph system. Of course, there may be some additional possible permissions, like the number of results a query may return or the maximum duration of a query.

Impact on Gremlin Server

Gremlin Server already has some functionality in place that can be leveraged to implement authorization. Gremlin Server exposes a graph to GLV clients in the form of a GraphTraversalSource. A GraphTraversalSource can have so-called strategies applied to it, i.e. automatically applied traversal steps, some of which can limit access to the graph. Of particular interest in this respect are the ReadOnlyStrategy and the SubgraphStrategy. The ReadOnlyStrategy raises an exception when a traversal intends to mutate the underlying graph. The SubgraphStrategy adds filter steps to a traversal, using predicates on the vertex and edge labels and vertex properties. Custom strategies for time limits or result size limits could easily be constructed.
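As a client-side illustration of these strategies (the proposal itself concerns the server side, where the admin would apply them in the Gremlin Server configuration or init script), the python sketch below connects a GLV client to a hypothetical Gremlin Server endpoint and applies the ReadOnlyStrategy and SubgraphStrategy to the traversal source; the endpoint and labels are made up for the example.

from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import ReadOnlyStrategy, SubgraphStrategy
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# Hypothetical Gremlin Server endpoint exposing a traversal source named 'g'
connection = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g')

# A traversal source that cannot mutate the graph and only sees
# 'person' vertices and 'knows' edges
g = traversal().withRemote(connection).withStrategies(
    ReadOnlyStrategy(),
    SubgraphStrategy(vertices=__.hasLabel('person'), edges=__.hasLabel('knows')))

print(g.V().count().next())  # counts only the 'person' vertices
connection.close()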
The Gremlin Server admin can configure different GraphTraversalSource instances for different groups of users and apply suitable strategies per user group. Then, an authorization module can easily check whether granted permissions are consistent with the GraphTraversalSource a user requests to access.

Apart from the bytecode requests with GLVs over websockets, Gremlin Server accepts older-style string-based requests, which are run as regular groovy scripts. This access mode does not lend itself to implementing authorization policies based on the ReadOnlyStrategy and SubgraphStrategy, because a groovy script running on Gremlin Server can simply access the underlying graph with a "g.getGraph()" call. While bytecode requests do not suffer from this direct link between the TinkerPop APIs, bytecode requests could still circumvent the authorization policies by using lambdas or by simply removing the controlling strategies. This implies that authorization in Gremlin Server requires some bytecode inspection to prevent the usage of these mechanisms.
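To give an impression of what such an inspection could look like, the python sketch below screens a gremlinpython Bytecode object for lambdas and for attempts to remove strategies; this is just an illustration of the idea and not the inspection logic proposed for Gremlin Server itself (which would live on the JVM side).

from gremlin_python.process.traversal import Bytecode

# Source operations that would lift the restricting strategies
FORBIDDEN_SOURCES = {'withoutStrategies'}
# Steps that typically carry lambdas when a python callable is among their arguments
LAMBDA_STEPS = {'map', 'filter', 'flatMap', 'sideEffect', 'branch'}

def is_acceptable(bytecode):
    """Rough screening of a traversal's bytecode; purely illustrative."""
    for operator, *args in bytecode.source_instructions:
        if operator in FORBIDDEN_SOURCES:
            return False
    for operator, *args in bytecode.step_instructions:
        if operator in LAMBDA_STEPS and any(callable(arg) for arg in args):
            return False
        # recurse into nested (anonymous) traversals
        if not all(is_acceptable(arg) for arg in args if isinstance(arg, Bytecode)):
            return False
    return True

# Usage: is_acceptable(g.V().out('knows').bytecode)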

Proposal

Introducing authorization to Gremlin Server need not be a big-bang operation, and probably should not be one. The current proposal has a minimal first step in its scope, which includes:
  • Open up Gremlin Server to authorization plugins in a way analogous to the authentication plugin mechanism. This implies that for user requests that do not get authorized, a suitable error message is returned to the user. Note that the presence of an authorization plugin would be optional, so the mechanism would be non-breaking for existing applications.
  • The authorization plugin would receive the authenticated user as ascertained by the authentication plugin, as well as the user request in the form of either a bytecode request or a string-based request. Typically, an authorization plugin would use the user request (traversal) to derive some relevant features, like the GraphTraversalSource accessed and the application of lambdas, OLAP, strategies, etc. It is a matter of taste whether to only provide the raw request on the plugin API or also the derived features that are of interest to multiple plugins.
  • To illustrate the use of the authorization plugin mechanism, a sample WhiteListAuthorizer is provided that allows for configuring a limited authorization policy through a yaml file, as sketched below. A typical use case is a data science team that wants to be able to directly query the graph store of a large graph application that only has authorization functionality in its GUI.
So, this first step supports an initial use case and lowers the barrier for more ambitious integrations with larger stores for authorization data, like an LDAP store, Apache Ranger or Active Directory.
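To make the white-list idea a bit more tangible, here is a hypothetical sketch in python of what such a policy check could boil down to; the actual WhiteListAuthorizer would be implemented in Java inside Gremlin Server, and the yaml layout shown in the comments is invented for illustration only.

import yaml  # pyyaml

# Hypothetical authorization.yaml (layout invented for this sketch):
#   authorized:
#     gremlin-groovy: [admin-group]             # who may send string-based (script) requests
#     g_readonly: [analyst-group, admin-group]  # traversal source with restricting strategies
#     g: [admin-group]                          # unrestricted traversal source
with open('authorization.yaml') as f:
    POLICY = yaml.safe_load(f)['authorized']

def authorize(user_groups, traversal_source, is_script):
    """Allow the request only if one of the user's groups is white-listed for
    the requested traversal source (or for script evaluation in general)."""
    key = 'gremlin-groovy' if is_script else traversal_source
    return bool(set(POLICY.get(key, [])) & set(user_groups))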

Readers wanting to follow the fate of this proposal can do so by visiting:

[Edited] A pull request implementing this proposal, amended with additional comments from the TinkerPop community, was merged on December 24, 2020.

      2020-03-22

      Regulate supermarket access during crisis using IT

      The world currently needs social distancing to keep the number of COVID-19 infected people at manageable numbers. This gives rise to two problems related to the access to food in supermarkets:
      1. People try to get more food than they currently need, to protect their loved ones in the near future. This urge might easily increase as more people die and food supply chains get disturbed by the illness of employees.
      2. More people are present in the supermarkets than necessary. This is in turn related to more people being at home, half-filled supermarkets, parents bringing their children, etc.

      Social distancing in supermarkets could be improved enormously by regulating the access to supermarkets in the following way:
      • Each household gets two single-person entry tickets per week at allotted times for buying groceries to a certain maximum amount.
      • The elderly or people needing care can authorize someone to do their shopping.

      Providing and checking the entry tickets poses an IT challenge, but many building blocks are already present:
      • Museums issue entry tickets at allotted times for blockbuster exhibitions
      • Theaters, football stadiums, etc. already scan tickets
      • In the developed world, governments have an accurate overview of households and their members for at least 90% of the population
      • Many governments have issued digital identities that can be applied for the described authorizations.

      Of course, embarking on a course of regulating supermarket access during a crisis also has its risks:
      • People might get even more worried (but it might just as well give people more trust in the government's handling of the crisis)
      • Digital access is still a problem for many people, especially the elderly and the homeless.
      • Governments might later abuse the data gathered during the issuing of tickets, thus infringing on civil liberties.

      Given these considerations, I personally feel that regulating and controlling supermarket access during the current COVID-19 crisis offers enormous potential to improve social distancing and slow the spread of the virus.

      2019-07-18

      Running the latest Apache Spark version on an existing Hadoop cluster

      The other day I heard that two colleagues had managed to run the latest Apache Spark release on an ageing HDP-2.6.x Hadoop cluster. I figured that was cool because I had tried to run Apache TinkerPop's OLAP queries on the same Hadoop cluster without success and knew that a Google search on this issue does not return any usable resources.

      While replaying their experiment on my own machine, I hit upon a small configuration issue that did lead me to a previous description of the central idea needed to run vanilla Apache Spark on a commercial Hadoop distribution. It seems, however, that the author used a potentially offensive word in his blog title (which I will not repeat here for obvious reasons) that prevented the blog from appearing in the top 10 results of any Google query on the subject. So, the main intention of my current blog is to get this useful information higher in the Google search results. Posting on blogger.com might also help in this. While at it, I will provide some additional details.

      The central idea is that you use the vanilla spark-2.4.3-bin-without-hadoop binary distribution. At first sight this seems counterintuitive: Spark provides binary distributions for the various Hadoop versions, and the distribution without Hadoop seems only geared towards a stand-alone Spark cluster. However, on second thought it is only logical: the compatibility issues between vanilla Spark and commercial Hadoop distributions arise from the fact that commercial parties like Cloudera, (former) HortonWorks and MapR backport new Hadoop features into older Hadoop versions to satisfy their need for "stable" versions. The issue I ran into with HDP-2.6.x is that Hadoop services could raise HA-related exceptions that are not known to the vanilla Hadoop client. This also renders any vanilla Spark distribution with a bundled Hadoop client unusable. By using spark-*-without-hadoop you can simply add your cluster's Hadoop binaries to your application classpath and everything will be fine.

      Of course, there are still some catches. Apparently, in the case I described, Hortonworks only modified the APIs of the Hadoop services and left the API of the Hadoop client untouched. But one day, a provider could decide to "optimize" the interworking between the Hadoop client and Spark. Also, putting the complete Hadoop client binaries on your classpath bears the risk of additional dependency conflicts compared to the set of transitive dependencies that you would already get from just using Spark.

      After seeing the logic of using spark-*-without-hadoop, the actual job configuration is surprisingly simple. The examples below assume that you have the spark-2.4.3-bin-without-hadoop distribution available in /opt. You only need the distribution on your local machine, not on the cluster. For the particular case of HDP, the various configuration files present in /etc/hadoop/conf require the hdp.version system property to be set on the JVMs of both the Spark driver and the YARN application master. Other commercial distributions may have similar requirements.

      Happy sparking!


      Spark Java:

      export SPARK_HOME=/opt/spark-2.4.3-bin-without-hadoop
      export SPARK_MAJOR_VERSION=2
      export SPARK_DIST_CLASSPATH=$(hadoop classpath)
      export HADOOP_CONF_DIR=/etc/hadoop/conf/

      $SPARK_HOME/bin/spark-submit --master yarn --deploy-mode client \
                  --class org.apache.spark.examples.SparkPi \
                  --conf "spark.driver.extraJavaOptions=-Dhdp.version=2.6.2.0-205" \
                  --conf "spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.2.0-205" \
                  $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.3.jar


      PySpark:

      export SPARK_HOME=/opt/spark-2.4.3-bin-without-hadoop
      export SPARK_MAJOR_VERSION=2
      export SPARK_DIST_CLASSPATH=$(hadoop classpath)
      export HADOOP_CONF_DIR=/etc/hadoop/conf/
      export PYSPARK_PYTHON=/opt/rh/rh-python36/root/usr/bin/python

      $SPARK_HOME/bin/spark-submit --master yarn --deploy-mode client \
                  --conf "spark.driver.extraJavaOptions=-Dhdp.version=2.6.2.0-205" \
                  --conf "spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.2.0-205" \
                  $SPARK_HOME/examples/src/main/python/pi.py


      2019-06-16

      Transferring a subgraph from JanusGraph to Neo4j


      This blog will not go into very much technical detail, but merely addresses the fact that a Google search on the blog title does not guide you to any immediately usable resource. Yet, I think this is a relevant use case. While the JanusGraph backends enable you to store and query huge datasets in a linearly scalable way, data science teams often prefer to work on smaller subsets of the graph data in Neo4j clients because of the better support for visual data exploration and for mixing in additional data.

      Exporting a subgraph from JanusGraph

      The gremlin query language has a dedicated subgraph "step" to include edges and their attached vertices into a dataset that can be operated upon as a graph. The code below, for execution in the gremlin console, extracts some data from the Graph of the Gods sample graph and subsequently writes it to a file in the graphML format.

      graph = JanusGraphFactory.open("inmemory")
      GraphOfTheGodsFactory.loadWithoutMixedIndex(graph, true)
      g = graph.traversal()

      subGraph = g.V().has('name', 'jupiter').bothE().subgraph('jupiter').cap('jupiter').next()

      stream = new FileOutputStream("data/jupiter.xml")
      GraphMLWriter.build().vertexLabelKey("labels").create().writeGraph(stream, subGraph)
      This code uses the GraphMLWriter class directly. The TinkerPop reference documentation prescribes the .io() method instead, but then the resulting graphML output uses the default "labelV" key to indicate the label of a vertex. This key is not recognized by the Neo4j apoc plugin, so you will get a Neo4j graph without vertex labels. Rather, the "labels" key should be used to have Neo4j understand that the key refers to vertex labels, which is what the vertexLabelKey("labels") call on GraphMLWriter achieves. The mismatch in the use of label keys probably occurred because TinkerPop supports a single vertex label only, while Neo4j supports multiple vertex labels.
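      If you already have a graphML file that was written with the default "labelV" key, a crude but quick workaround could be to rewrite the key before importing the file into Neo4j. The python sketch below does just that and assumes that the string "labelV" does not occur anywhere else in the file (e.g. inside property values).

      # Rename the default TinkerPop vertex label key "labelV" to "labels" so that
      # apoc.import.graphml with readLabels:true picks up the vertex labels
      with open('data/jupiter.xml') as f:
          graphml = f.read()

      with open('data/jupiter-neo4j.xml', 'w') as f:
          f.write(graphml.replace('"labelV"', '"labels"'))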

      Importing the graphml file into Neo4j

      call apoc.import.graphml('../janusgraph-0.3.1-hadoop2/data/jupiter.xml', {batchSize: 10000, readLabels: true, storeNodeIds: false, defaultRelationshipType:"RELATED"})
      
      MATCH (n) RETURN n
      

      This is straight from the Neo4j documentation and works once you have found out about the required "labels" key in the graphML data.

      References

      https://tinkerpop.apache.org/javadocs/3.3.4/full/org/apache/tinkerpop/gremlin/structure/io/graphml/GraphMLWriter.html
      https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/721

      2019-03-31

      Circular references in Plotly/Dash

      Plotly Dash is a simple python web framework for quickly building interactive data visualizations. While Dash has a lot of power, it also has its limitations. One of the current limitations is that the dependencies between the Dash web components cannot be circular. If you create a callback for an output that is also part of the inputs, Dash will raise an exception saying that "this is bad". If you try to circumvent this check by creating two callbacks with interchanged inputs and outputs, the browser just shows "Error loading dependencies".

      So, are circular references really "bad"? I would rather say that forbidding them is a consequence of the Dash principle that each callback makes a change in the resulting web page. But in fact, Dash is already not consistent about this principle:
      •  Dash defines a PreventUpdate exception that one can raise when the application's state does not require an update of the web page. This would be the pythonic way to handle intended circular references and is suggested in the previously linked Dash issue (see the sketch after this list).
      • From the Dash gotchas: "If you have disabled callback validation in order to support dynamic layouts, then you won't be automatically alerted to the situation where a component within a callback is not found within a layout. In this situation, where a component registered with a callback is missing from the layout, the callback will fail to fire. For example, if you define a callback with only a subset of the specified Inputs present in the current page layout, the callback will simply not fire at all."
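      If Dash were to allow circular dependencies, the PreventUpdate pattern from the first bullet could look roughly like the sketch below, written against the checkbox example at the end of this post (it reuses the app, Output, Input and State names from that example). Note that this is purely illustrative: Dash's current validation rejects such a callback cycle, which is exactly the limitation the remainder of this post works around.

      from dash.dependencies import Input, Output, State
      from dash.exceptions import PreventUpdate

      # Hypothetical reverse callback: only update the 'all' checkbox when its
      # value would actually change, so the cycle with update_cities() settles.
      @app.callback(Output('all', 'values'),
                    [Input('cities', 'values')],
                    [State('all', 'values')])
      def sync_all(cities, all_values):
          new_all = ['all'] if len(cities) == 3 else []
          if new_all == all_values:
              raise PreventUpdate
          return new_all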
      It turns out that this second "feature" can be exploited as a temporary workaround to have circular references, but before doing that, let us consider our use case. I simply wanted the style of checkboxes that you find in the column filters of Microsoft Excel and LibreOffice Calc, because I figured that my users would be accustomed to these. The list of checkboxes consists of one "select all" checkbox and the list of remaining checkboxes. Checking the "select all" box affects the state of one or more of the remaining checkboxes, while checking any of the remaining boxes can affect the state of the "select all" box. So, if we model our list of checkboxes with two Dash dcc.Checklist components, we certainly have circular dependencies.

      Exploitation of the Dash gotcha works by including an extra Dash component in the loop; in the code example below this is the html.Div with id='loop_breaker'. This html.Div is dynamically generated inside the static html.Div with id='loop_breaker_container'. The 'loop_breaker' component is only generated when:
      • the user has deselected all options while the "all" checkbox is still checked
      • the user has selected all options while the "all" checkbox is still unchecked
      In this way, we get our intended circular reference while the initial state of the application layout passes the Dash validation criteria on circular references.


      # -*- coding: utf-8 -*-
      import dash
      import dash_core_components as dcc
      import dash_html_components as html
      from dash.dependencies import Input, Output, State
      
      external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
      
      app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
      app.config['suppress_callback_exceptions'] = True
      
      app.layout = html.Div(children=[
          html.H4(children='Excel-like checkboxes'),
          dcc.Checklist(
              id='all',
              options=[{'label': 'all', 'value': 'all'}],
              values=[]
          ),
          dcc.Checklist(
              id='cities',
              options=[
                  {'label': 'New York City', 'value': 'NYC'},
                  {'label': 'Montréal', 'value': 'MTL'},
                  {'label': 'San Francisco', 'value': 'SF'}
              ],
              values=['MTL', 'SF']
          ),
          html.Div(id='loop_breaker_container', children=[])
      ])
      
      
      @app.callback(Output('cities', 'values'),
                    [Input('all', 'values')])
      def update_cities(inputs):
          if len(inputs) == 0:
              return []
          else:
              return ['NYC', 'MTL', 'SF']
      
      
      @app.callback(Output('loop_breaker_container', 'children'),
                    [Input('cities', 'values')],
                    [State('all', 'values')])
      def update_all(inputs, _):
          states = dash.callback_context.states
          if len(inputs) == 3 and states['all.values'] == []:
              return [html.Div(id='loop_breaker', children=True)]
          elif len(inputs) == 0 and states['all.values'] == ['all']:
              return [html.Div(id='loop_breaker', children=False)]
          else:
              return []
      
      
      @app.callback(Output('all', 'values'),
                    [Input('loop_breaker', 'children')])
      def update_loop(all_true):
          if all_true:
              return ['all']
          else:
              return []
      
      
      if __name__ == '__main__':
          app.run_server(debug=True)