Where
INRIA (Paris)
23, avenue d'Italie
75013 Paris
The meeting will take place in Room Bleu 1 (6th floor)
Day One
Sessions
9:30am | 10:00am | Arrival / "Tour de table" |
10:00am | 11:00am | Replication / Algorithms |
11:00am | 11:30am | break |
11:30am | 12:30pm | Replication / Algorithms 2 |
12:30pm | 1:30pm | lunch |
1:30pm | 3:00pm | Replication / Theory |
3:00pm | 3:30pm | break |
3:30pm | 5:00pm | Replication / Theory 2 |
Participants
SCORE: Gérald Oster, Luc André, Pascal Urso, Mehdi Ahmed-Nacer, Hyun-Gul Roh
REGAL: Marc Shapiro, Marek Zawirski, Masoud Saeida Ardekani, Pierpaolo Cincilla, Lokesh Gidra, Mesaac Makpangou
CASSIS: Abdessamad Imine, Hoang Bao Thien
ASAP: Stéphane Weiss
XWiki SAS: Fabio Mancinelli
GDD: Pascal Molli
UNL: Nuno Preguiça
Designing an advanced CRDT: a graph data structure for asynchronous processing of web data
Presenter: Nuno Preguiça Documents: PDF
Notes.
Graph CRDT
Web structure represented as directed graphs
Web evolves therefore the graph has to be updated
Web pages processed concurrently by multiple servers / incremental processing
State-based solution (based on observed-remove sets)
Two sets: nodes + arcs
Operations:
- updates: addNode(n), removeNode(n), addArc(n1, n2), removeArc(n1, n2)
- queries (reads): lookupArc(n1, n2)
Garbage collection mechanism based on state vectors to avoid tombstones
Snapshot management mechanism computed using state vectors
- Useful for what? Supporting access to consistent data in transactions and to the data evolution history
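The structure described above (two observed-remove sets, one for nodes and one for arcs) can be sketched as follows. This is a minimal illustration, not the presented design: it keeps tombstones instead of the state-vector garbage collection mentioned in the notes, and the class names, the `merge` method, and the rule that an arc is visible only while both endpoints exist are assumptions made for the example.

```python
import uuid


class ORSet:
    """State-based observed-remove set: every add gets a fresh unique tag."""

    def __init__(self):
        self.added = set()    # (element, tag) pairs ever added
        self.removed = set()  # (element, tag) pairs observed, then removed

    def add(self, element):
        self.added.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Only the tags observed locally are removed; a concurrent add
        # elsewhere carries a new tag and therefore survives the merge.
        self.removed |= {pair for pair in self.added if pair[0] == element}

    def contains(self, element):
        return any(e == element for (e, _) in self.added - self.removed)

    def merge(self, other):
        # Set union is commutative, associative and idempotent.
        self.added |= other.added
        self.removed |= other.removed


class GraphCRDT:
    """Hypothetical graph built from two OR-sets: nodes and arcs."""

    def __init__(self):
        self.nodes = ORSet()
        self.arcs = ORSet()

    def add_node(self, n):
        self.nodes.add(n)

    def remove_node(self, n):
        self.nodes.remove(n)

    def add_arc(self, n1, n2):
        self.arcs.add((n1, n2))

    def remove_arc(self, n1, n2):
        self.arcs.remove((n1, n2))

    def lookup_arc(self, n1, n2):
        # Assumption: an arc is visible only while both endpoints exist.
        return (self.arcs.contains((n1, n2))
                and self.nodes.contains(n1)
                and self.nodes.contains(n2))

    def merge(self, other):
        self.nodes.merge(other.nodes)
        self.arcs.merge(other.arcs)


# Two servers process the same page concurrently.
a, b = GraphCRDT(), GraphCRDT()
a.add_node("p1")
a.add_node("p2")
a.add_arc("p1", "p2")
b.merge(a)
b.remove_node("p2")  # removes only the tag b has observed
a.add_node("p2")     # concurrent re-add with a fresh tag
a.merge(b)
b.merge(a)
# Both replicas converge; the re-added node and the arc are visible.
```

This also illustrates the answer given to Pascal U. below: the URL alone cannot serve as the unique identifier, because the same URL may be added and removed several times, and a remove must only cancel the adds it has observed.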
Pascal U.: The GC looks like the management in Logoot? Why not use the URI/URL as the unique identifier? Because the referenced content can be added/removed multiple times; a remove operation removes all observed UIDs associated with a URL.
Marc: You cannot remove immediately, right?
Nuno: No, I can.
Abdessamad: What about the limitations of version vectors?
Nuno: In our context (cloud computing), the number of clusters is limited and known, and so are their unique identifiers.
Pascal M: So, basically, the scope of applications of CRDTs has been reduced. At the beginning, Woot/Logoot were proposed for peer-to-peer networks. You can use consensus in this context, no?
Nuno: Well, consensus might not be suitable. For instance, Amazon does not use consensus (maybe because its data centers are spread over the world).
Marc: If you want high performance, you can't put consensus in your loop.
Nuno: The cloud computing literature states that optimistic replication is used.
C-Set: A Commutative Replicated Data Type for Semantic Stores
Presenter: Pascal Molli Documents: PDF
Notes.
Presented at the REsource Discovery 2011 workshop (http://ldc.usb.ve/~mvidal/RED2011/), co-located with the Extended Semantic Web Conference (ESWC 2011).
Context: Social web is adopting semantic technologies and is generating massive new semantic datasets.
Challenge: synchronizing semantic stores (very large data sets, autonomous participants, etc.)
CRDT for semantic store: semantic data can be represented as sets of triples.
Need a CRDT for sets.
A set (with its real semantics) is not a CRDT.
Proposal: C-Set
S = {(e, count) : e \in elements, count \in Z}
local operations: ins(e: element), del(e:element)
remote operations: rins(e:element, k: Z), rdel(e:element, k: Z)
Current proposal does not preserve intentions (see counter-example)
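A minimal sketch of the counter-based set outlined above, under stated assumptions. The notes give only the state (an integer counter per element) and the four operation names; the rule used here to choose the delta `k` (a local insert lifts the counter to 1, a local delete drops it to 0), the no-op behaviour on redundant operations, the `apply` helper, and the method name `delete` (since `del` is a Python keyword) are assumptions for illustration, not the authors' definition.

```python
class CSet:
    """Sketch of a counter-based set: S = {(e, count)}, count in Z."""

    def __init__(self):
        self.count = {}  # element -> integer counter

    def contains(self, e):
        # Assumption: an element is in the set when its counter is positive.
        return self.count.get(e, 0) > 0

    # Local operations: compute a delta, apply it, return the remote op.
    def ins(self, e):
        k = self.count.get(e, 0)
        if k > 0:
            return None            # already present: nothing to send
        delta = abs(k) + 1         # lift the counter to exactly 1
        self.rins(e, delta)
        return ("rins", e, delta)

    def delete(self, e):
        k = self.count.get(e, 0)
        if k <= 0:
            return None            # already absent: nothing to send
        self.rdel(e, k)            # drop the counter to exactly 0
        return ("rdel", e, k)

    # Remote operations: plain integer addition, hence commutative.
    def rins(self, e, k):
        self.count[e] = self.count.get(e, 0) + k

    def rdel(self, e, k):
        self.count[e] = self.count.get(e, 0) - k

    def apply(self, op):
        kind, e, k = op
        if kind == "rins":
            self.rins(e, k)
        else:
            self.rdel(e, k)


a, b = CSet(), CSet()
op_a = a.ins("triple")    # a's counter becomes 1
op_b = b.ins("triple")    # concurrent insert on b, counter becomes 1
a.apply(op_b)
b.apply(op_a)             # both counters are now 2
op_d = a.delete("triple") # delta is 2, a's counter returns to 0
b.apply(op_d)

# A third replica receiving the same operations in another order
# reaches the same counter, because the remote ops only add integers.
c = CSet()
for op in (op_d, op_b, op_a):
    c.apply(op)
```

Convergence follows from the commutativity of integer addition; whether the resulting state matches what each user intended is a separate question, which is the limitation the notes point out.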
Marc: Did you try OR-set?
Pascal M.: Yes, I think it will work, but we will lose on another point (vectors? tombstones?)
Telex Light: a platform for cooperative social applications
Presenter: Pierpaolo Cincilla Documents: PDF
Notes.
Telex: A communication infrastructure for collaborative nomadic applications
Demo
The cost of consistency in large-scale replication
Presenter: Masoud Saeida Ardekani Documents: PDF
Notes.
Full replication, partial replication (atomic broadcast), genuine replication (atomic multicast)
Snapshot isolation (read latest snapshot), generalized snapshot isolation (read any snapshot)
Consistent snapshots are determined using concurrent version vectors
Generalized snapshot isolation (GSI) : snapshot monotonicity
Read-write dependence vector (RWDV)
More scalable than other systems based on the genuine approach (because monotonicity is relaxed) and than systems based on partial replication (whose latency is greater because they require atomic broadcast).
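The notes mention that consistent snapshots are determined using version vectors. As background only, here is a minimal sketch of the standard version-vector comparison; it is not the protocol presented in the talk, and the function names and the dictionary representation (replica id mapped to a counter) are assumptions for the example.

```python
def compare(v1, v2):
    """Order two version vectors: before, after, equal, or concurrent."""
    keys = set(v1) | set(v2)
    # A missing entry counts as 0 (no update seen from that replica).
    le = all(v1.get(k, 0) <= v2.get(k, 0) for k in keys)
    ge = all(v1.get(k, 0) >= v2.get(k, 0) for k in keys)
    if le and ge:
        return "equal"
    if le:
        return "before"
    if ge:
        return "after"
    return "concurrent"


def merge(v1, v2):
    """Entry-wise maximum: the least vector that dominates both inputs."""
    return {k: max(v1.get(k, 0), v2.get(k, 0)) for k in set(v1) | set(v2)}


x = {"site1": 2, "site2": 1}
y = {"site1": 2, "site2": 3}
z = {"site1": 3, "site2": 0}
order_xy = compare(x, y)  # x is dominated by y
order_yz = compare(y, z)  # neither dominates the other
```

Two versions whose vectors compare as "concurrent" were produced without knowledge of each other, which is the case a snapshot selection rule has to handle explicitly.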
Asynchronous re-balancing of a replicated tree
Presenter: Marek Zawirski Documents: PDF
Notes.
Delta (previous work) : Novel catch-up mechanism based on symbolic positions
State of the art in using semantic information to ensure consistency
Presenter: Marek Zawirski Documents: PDF
Notes.
Bloom [Alvaro et al. 2011]: Uses a monotonic logic programming model to encourage writing pieces of a program that can run concurrently. What about writing/expressing CRDTs in Bloom?
http://www.bloom-lang.net/
ESCOPADS: Earth-SCale, COnsistent, Privacy-preserving, Autonomic Data Service
Presenter: Marc Shapiro
Notes.
Distributed search engine/crawler/index/precomputed queries/... based on CRDTs
Day Two
Sessions
9:30am | 10:30am | Evaluation / Experimentations |
10:30am | 11:00am | break |
11:00am | 12:30pm | Requirements / Architecture |
12:30pm | 1:30pm | lunch |
1:30pm | 3:00pm | Security |
3:00pm | 3:30pm | break |
3:30pm | 4:30pm | Coordination / Discussions |
Participants
SCORE: Gérald Oster, Luc André, Pascal Urso, Mehdi Ahmed-Nacer, Hyun-Gul Roh, Claudia-Lavinia Ignat, Hien Thi Thu Truong
REGAL: Marc Shapiro, Marek Zawirski, Pierpaolo Cincilla, Lokesh Gidra
CASSIS: Michaël Rusinowitch, Hoang Bao Thien
ASAP: Stéphane Weiss
XWiki SAS: Fabio Mancinelli
GDD: Pascal Molli
UNL: Nuno Preguiça
Evaluating CRDTs for Real-time Document Editing
Presenter: Mehdi Ahmed-Nacer Documents: PDF
Abstract. TBC
Notes.
Nuno: Are user operations captured or computed with a diff algorithm?
Mehdi: They are captured as you type.
Marek: You mean your logs store the causal relations between operations?
Mehdi: Yes.
Pascal U.: We replay the history while taking causal relations into account. But it is worth noting that concurrent operations might be replayed in an order different from the real one.
Nuno: Remark about average values being reported without deviation.
Nuno: Why does OT not perform that well in the presented scenario?
Marc: Any differences with the results (performance evaluation) observed with serialized traces from Wikipedia?
Marc: Any experimental results for memory consumption?
Marek: Did you look at how the computed final states differ between algorithms?
Discussions Related to XWiki Integration
Presenter: Fabio Mancinelli
Notes.
We need to define a format for collected traces.
Architecture / Requirements
Presenter: Stéphane Weiss
Notes.
CRDT, Telex, ... have different requirements
collaboration model: users you know, users that share preferences, etc.
Telex: you need to know the number of users and the ids of the users
For instance, Git and Wikipedia are not based on the same collaboration model. Wikipedia would never work with access rights, etc.
P2P real-time Facebook? P2P real-time Google Docs? P2P real-time "search engine"?
Realtime Web: it breaks the paradigm crawl/cache/search
but it is not editing!
so caching techniques no longer work
(for example, in a wiki, rendering has to be performed on the client)
the architecture would be completely different
editing -> propagation as fast as possible to all other users (readers, not only writers)
"TBA", Fabio Mancinelli
Discussion to take place in Nancy when Stéphane comes for several days. We might also organize a meeting with Fabio on those days.
On an API for Securing Distributed Social Networks
Presenter: Hoang Bao Thien Documents: PDF
Abstract. TBC
Notes.
Interesting reference: http://www.safebook.us/home.html
A Contact extended Push Pull Clone Model
Presenter: Hien Thi Thu Truong Documents: PDF
Abstract. TBC
Notes.
Discussions related to STREAMS WP4 and Deliverable L4.1 (T0+18) will take place in Nancy during a meeting in July.