FLOC 2018: FEDERATED LOGIC CONFERENCE 2018
LEARNAUT ON FRIDAY, JULY 13TH


09:00-10:30 Session 83G
09:00
Algorithms for Weighted and Probabilistic Context-Free Grammars and Convex Optimization
10:00
Learning Graph Weighted Models on Pictures

ABSTRACT. Graph Weighted Models (GWMs) have recently been proposed as a natural generalization of weighted automata over strings and trees to arbitrary families of labeled graphs (and hypergraphs). A GWM generically associates a labeled graph with a tensor network and computes a value by successive contractions directed by its edges. In this paper, we consider the problem of learning GWMs defined over the graph family of pictures (or 2-dimensional words). As a proof of concept, we consider the simple Bars and Stripes picture language and provide an experimental study investigating whether this language can be learned in the form of a GWM from positive and negative examples using gradient-based methods. Our results suggest that this is indeed possible and that investigating the use of gradient-based methods to learn picture series and functions computed by GWMs over other families of graphs could be a fruitful direction.
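
To make the tensor-network view concrete, below is a minimal sketch of the contraction a GWM-like model performs on a picture. The rank-4 cores with axes (left, right, up, down), the bond dimension, and the convention of closing dangling boundary bonds with a fixed basis vector are all assumptions of this illustration, not necessarily the paper's parameterization; in a learning setup, the core tensors would be the trainable parameters for the gradient-based methods mentioned above.

```python
import numpy as np

D = 2                              # internal bond dimension (a hyperparameter)
rng = np.random.default_rng(0)
# One rank-4 core per pixel value, axes ordered (left, right, up, down);
# random here for illustration, trainable in a learning setup.
cores = {v: 0.5 * rng.normal(size=(D, D, D, D)) for v in (0, 1)}
e0 = np.zeros(D); e0[0] = 1.0      # fixed vector closing dangling boundary bonds

def contract_row(pixels):
    """Contract one picture row horizontally.
    Returns a tensor with axes (up_1..up_n, down_1..down_n)."""
    R = np.tensordot(e0, cores[pixels[0]], axes=([0], [0]))   # axes (r, u1, d1)
    k = 1
    for v in pixels[1:]:
        # fuse the open right bond of R with the next core's left bond
        R = np.tensordot(R, cores[v], axes=([0], [0]))
        # reorder (u_1..u_k, d_1..d_k, r, u_new, d_new)
        # into    (r, u_1..u_{k+1}, d_1..d_{k+1})
        perm = ([2 * k] + list(range(k)) + [2 * k + 1]
                + list(range(k, 2 * k)) + [2 * k + 2])
        R = R.transpose(perm)
        k += 1
    return np.tensordot(R, e0, axes=([0], [0]))   # close the open right bond

def gwm_value(picture):
    """Value of the picture: full contraction of the grid network, row by row."""
    n = len(picture[0])
    B = e0
    for _ in range(n - 1):
        B = np.tensordot(B, e0, axes=0)   # top boundary, one bond per column
    for row in picture:
        R = contract_row(row)
        B = np.tensordot(B, R, axes=(list(range(n)), list(range(n))))
    for _ in range(n):                    # close the bottom boundary bonds
        B = np.tensordot(B, e0, axes=([0], [0]))
    return float(B)

bars = [[1, 1, 1], [0, 0, 0], [1, 1, 1]]  # constant rows: a Bars-and-Stripes member
print(gwm_value(bars))
```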

10:30-11:00 Coffee Break
11:00-12:30 Session 86H
11:00
TBA
12:00
Learning Tree Distributions by Hidden Markov Models

ABSTRACT. Hidden tree Markov models allow learning distributions over tree-structured data while being interpretable as nondeterministic automata. We provide a concise summary of the main approaches in the literature, focusing in particular on the causality assumptions introduced by the choice of a specific tree visit direction. We will then sketch a novel non-parametric generalization of the bottom-up hidden tree Markov model, with its interpretation as a nondeterministic tree automaton with infinitely many states.
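
As a rough illustration of the bottom-up direction the abstract refers to, the sketch below computes a tree's likelihood under a toy parametric bottom-up hidden tree Markov model. The binary-tree restriction, the joint children-to-parent transition tensor, and all names are assumptions of this example; the talk's non-parametric, infinite-state generalization is not modeled here.

```python
import numpy as np

S, V = 3, 2                      # number of hidden states / visible symbols
rng = np.random.default_rng(1)
pi = rng.dirichlet(np.ones(S))   # state prior for leaves
# A[p, l, r] = P(parent state p | left child state l, right child state r)
A = rng.dirichlet(np.ones(S), size=(S, S)).transpose(2, 0, 1)
B = rng.dirichlet(np.ones(V), size=S)   # B[state, symbol]: emission table

def inside(tree):
    """tree = (symbol,) for a leaf, or (symbol, left, right) internally.
    Returns beta[s] = P(observed subtree, subtree root in state s)."""
    if len(tree) == 1:
        return pi * B[:, tree[0]]
    sym, left, right = tree
    bl, br = inside(left), inside(right)
    # marginalize over the children's states, bottom-up
    trans = np.einsum('plr,l,r->p', A, bl, br)
    return trans * B[:, sym]

t = (0, (1,), (0, (1,), (0,)))   # a small example tree
print(inside(t).sum())           # likelihood of the whole tree
```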

12:30-14:00 Lunch Break
14:00-15:30 Session 87H
14:00
Weak and Strong learning of Probabilistic Context-Free Grammars: theoretical and empirical perspectives

ABSTRACT. The problem of learning context-free grammars from a finite sample of their yields is a classic problem in grammatical inference; in this talk I will look at this problem and at a modern variant: the problem of learning probabilistic context-free grammars (PCFGs) from a sample of strings drawn i.i.d. from the distribution over strings defined by the grammar. We will consider both the problem of learning an accurate approximation of the distribution over strings (weak learning) and the much harder problem of also learning the underlying unobserved parse trees (strong learning), which is related to the task of unsupervised parsing in the NLP community. I will discuss some theoretical results using distributional learning approaches and present some empirical results on both synthetic and natural examples, motivated by the problem of first language acquisition.
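
For readers unfamiliar with the setup, the toy sketch below shows what "strings drawn i.i.d. from the distribution defined by a PCFG" means in practice; the grammar, its symbols, and its probabilities are invented for this example.

```python
import random

PCFG = {  # nonterminal -> list of (probability, right-hand side)
    'S': [(0.7, ['NP', 'VP']), (0.3, ['VP'])],
    'NP': [(1.0, ['she'])],
    'VP': [(0.6, ['runs']), (0.4, ['sees', 'NP'])],
}

def sample(symbol='S'):
    """Expand `symbol` top-down; symbols with no rules are terminals."""
    if symbol not in PCFG:
        return [symbol]
    probs, rhss = zip(*PCFG[symbol])
    rhs = random.choices(rhss, weights=probs)[0]
    return [tok for s in rhs for tok in sample(s)]

# An i.i.d. training sample of yields, as in the weak-learning setting.
corpus = [' '.join(sample()) for _ in range(5)]
print(corpus)
```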

15:00
Decision problems for Clark-congruential languages
SPEAKER: Tobias Kappé

ABSTRACT. When trying to learn a language using the Minimally Adequate Teacher model, one assumes the presence of a teacher that can answer equivalence queries for the class of languages under study, i.e., queries of the form "is the language equivalent to L?". This raises the question: is it feasible to implement such a teacher, or, more precisely, is equivalence for this class of languages decidable? We consider the Clark-congruential languages, a class of context-free languages of particular interest to MAT-based learning efforts, and show that, for this class, both equivalence and congruence are decidable.
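
The MAT setting the abstract describes can be sketched as an interface offering membership and equivalence queries. In the toy Python below the equivalence query is only approximated by bounded exhaustive search over short strings; closing exactly that gap (deciding equivalence exactly) is the paper's contribution for Clark-congruential languages. The class names and the a^n b^n target are assumptions of this example.

```python
import itertools

class Teacher:
    """A Minimally Adequate Teacher for a target language over an alphabet."""
    def __init__(self, target, alphabet):
        self.target = target          # target language as a predicate on strings
        self.alphabet = alphabet

    def member(self, w):
        """Membership query: is w in the target language?"""
        return self.target(w)

    def equivalent(self, hypothesis, max_len=8):
        """Equivalence query, approximated on all strings up to max_len.
        Returns (True, None) or (False, counterexample)."""
        for n in range(max_len + 1):
            for w in map(''.join, itertools.product(self.alphabet, repeat=n)):
                if self.target(w) != hypothesis(w):
                    return False, w
        return True, None

# Example: the context-free target a^n b^n.
anbn = lambda w: len(w) % 2 == 0 and w == 'a' * (len(w) // 2) + 'b' * (len(w) // 2)
teacher = Teacher(anbn, 'ab')
ok, cex = teacher.equivalent(lambda w: len(w) % 2 == 0)  # a wrong hypothesis
print(ok, cex)   # False 'aa'
```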

15:30-16:00Coffee Break
16:00-18:00 Session 88F
16:00
The Learnability of Symbolic Automata

ABSTRACT. Symbolic automata allow transitions to carry predicates over rich alphabet theories, such as linear arithmetic, and therefore extend classic automata to operate over infinite alphabets, such as the set of rational numbers. In this paper, we study the problem of learning symbolic automata and exactly characterize under what conditions symbolic automata can be learned efficiently. First, we present an L*-style algorithm for learning symbolic automata using membership and equivalence queries, which treats the predicates appearing on transitions as learnable entities in their own right. The main novelty of the algorithm is that it can take as input an algorithm Λ for learning predicates in the underlying alphabet theory, and it uses Λ to infer the predicates appearing on the transitions of the target automaton. Using this idea, the algorithm is able to learn automata operating over alphabet theories in which predicates are efficiently learnable using membership and equivalence queries. Furthermore, we prove that a necessary condition for efficient learnability of a symbolic finite automaton (SFA) is that predicates in the underlying algebra are also efficiently learnable using queries. Our results settle the learnability of a large class of SFA instances, leaving open only a subclass of SFAs over efficiently learnable alphabet theories. We implement the algorithm in an open-source library and show that it can efficiently learn automata that cannot be learned using existing algorithms, and that it significantly outperforms existing automata learning algorithms over large alphabets.
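
As a concrete picture of the objects being learned, here is a minimal symbolic finite automaton over the integers, with transitions guarded by predicates rather than concrete letters. The representation is an assumption of this sketch, not the paper's implementation; in the paper's setting, such guards would themselves be inferred by the predicate learner Λ.

```python
# A toy SFA accepting integer sequences that contain a non-negative number.
sfa = {
    'initial': 'q0',
    'accepting': {'q1'},
    'delta': [  # (source, guard predicate, target); guards partition the alphabet
        ('q0', lambda x: x < 0, 'q0'),
        ('q0', lambda x: x >= 0, 'q1'),
        ('q1', lambda x: True, 'q1'),
    ],
}

def accepts(sfa, word):
    """Run the SFA on a sequence of alphabet elements (here: integers)."""
    state = sfa['initial']
    for x in word:
        state = next(t for (s, g, t) in sfa['delta'] if s == state and g(x))
    return state in sfa['accepting']

print(accepts(sfa, [-3, -1, 42, 7]))   # True: 42 triggers the x >= 0 guard
print(accepts(sfa, [-3, -1]))          # False: never leaves q0
```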

16:30
Automatic clustering of a network protocol with weakly-supervised clustering

ABSTRACT. Abstraction is a fundamental part of learning behavioral models of systems. Usually, the abstraction is defined manually by domain experts. This paper presents a method to perform automatic abstraction for network protocols. In particular, a weakly supervised clustering algorithm is used to build an abstraction with a small vocabulary size for the widely used TLS protocol. To show the effectiveness of the proposed method, we compare the resulting abstract messages to a manually constructed (reference) abstraction. With a small amount of side information in the form of a few labeled examples, the method finds an abstraction that matches the reference abstraction perfectly.
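
One simple way to use a small amount of side information in clustering, shown below, is seeded k-means, where the few labeled examples fix the initial centroids so that clusters align with the intended abstraction. This is an illustrative stand-in under assumptions of my own (messages already featurized as vectors, two message classes with hypothetical labels), not necessarily the paper's algorithm.

```python
import numpy as np

def seeded_kmeans(X, seeds, k, iters=20):
    """X: (n, d) feature vectors (e.g., featurized TLS messages).
    seeds: dict label -> array of labeled points, len(seeds) == k."""
    centroids = np.array([pts.mean(axis=0) for pts in seeds.values()])
    for _ in range(iters):
        # assign each point to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # recompute centroids from current assignments
        for j in range(k):
            if (assign == j).any():
                centroids[j] = X[assign == j].mean(axis=0)
    return assign, centroids

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(5, 1, (50, 4))])
seeds = {'hello': X[:2], 'alert': X[50:52]}   # two labeled examples per class
labels, _ = seeded_kmeans(X, seeds, k=2)
print(labels[:5], labels[-5:])   # clusters follow the seeded labels
```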

17:00
Toward Learning Boolean Functions from Positive Examples

ABSTRACT. In this article we investigate the learning of DNF representations of Boolean functions, with two particular focuses: (i) we assume that a sample containing only positive examples is given to the learner (and hence no membership queries are allowed); and (ii) the functions we consider are true on a sparse set, i.e., one whose cardinality is far smaller than 2^n, where n is the number of variables of the problem instance. After motivating this problem, we provide a conceptual answer to how much to generalize beyond the sample, and where. We introduce an empirical greedy algorithm to address the problem under study and illustrate it on a randomly sampled DNF.
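
In the same spirit, the toy sketch below shows one way a greedy algorithm can decide how much to generalize beyond the sample and where: positive examples start as fully specified DNF terms, and the pair whose least general generalization adds the fewest satisfying assignments is merged, as long as total coverage stays below a budget that reflects the sparsity assumption. The term representation and the budget rule are assumptions of this illustration, not the paper's algorithm.

```python
from itertools import combinations

def lgg(t1, t2):
    """Least general generalization: keep agreeing bits, wildcard the rest."""
    return tuple(a if a == b else '*' for a, b in zip(t1, t2))

def cover(term):
    """Number of assignments satisfying a term over {0, 1, *}."""
    return 2 ** sum(c == '*' for c in term)

def greedy_dnf(positives, budget):
    terms = {tuple(p) for p in positives}
    while len(terms) > 1:
        # merge the pair whose generalization covers the fewest assignments
        t1, t2 = min(combinations(terms, 2), key=lambda p: cover(lgg(*p)))
        merged = lgg(t1, t2)
        new_cover = sum(map(cover, terms - {t1, t2})) + cover(merged)
        if new_cover > budget:   # stop before over-generalizing
            break
        terms = (terms - {t1, t2}) | {merged}
    return terms

sample = ['0011', '0010', '1100', '1101']
print(greedy_dnf(sample, budget=6))   # {('0','0','1','*'), ('1','1','0','*')}
```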

17:30
Learning Several Languages from Labeled Strings: State Merging and Evolutionary Approaches

ABSTRACT. The problem of learning pairwise disjoint deterministic finite automata (DFAs) from positive examples has recently been addressed. In this paper, we address the problem of identifying a set of DFAs from labeled strings and propose two methods. The first is based on state merging, with a heuristic related to the size of each state-merging iteration: merge operations involving a large number of states are extracted to provide sub-DFAs. The second method is based on a multi-objective evolutionary algorithm whose fitness function takes into account the accuracy of the DFAs w.r.t. the learning sample, as well as the desired number of DFAs. We evaluate our methods on a dataset originating from industry.
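
For context, state-merging approaches typically start from a prefix tree acceptor (PTA) built from the labeled strings and merge states whose subtrees carry no accept/reject conflict. The minimal sketch below shows that ingredient under assumptions of my own; it is not the paper's heuristic, which additionally scores merges by the number of states involved.

```python
class Node:
    def __init__(self):
        self.children = {}   # symbol -> Node
        self.label = None    # True (accept), False (reject), or None (unknown)

def build_pta(sample):
    """Build a prefix tree acceptor from (string, label) pairs."""
    root = Node()
    for w, positive in sample:
        node = root
        for a in w:
            node = node.children.setdefault(a, Node())
        node.label = positive
    return root

def compatible(p, q):
    """Could states p and q be merged? Recursively folds q's subtree onto
    p's and checks that no accepting state collides with a rejecting one."""
    if p.label is not None and q.label is not None and p.label != q.label:
        return False
    return all(a not in p.children or compatible(p.children[a], q.children[a])
               for a in q.children)

sample = [('aa', True), ('aaaa', True), ('a', False), ('aaa', False)]
pta = build_pta(sample)
q_a = pta.children['a']
print(compatible(pta, q_a))                 # False: folding creates a conflict
print(compatible(pta, q_a.children['a']))   # True: yields the even-number-of-a's DFA
```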

19:00-21:30 Workshops dinner at Keble College

Drinks reception from 7pm, to be seated by 7:30pm (pre-booking via the FLoC registration system required; guests welcome).

Location: Keble College