Theory Seminar Spring 2009

Theory Seminar Schedule – Spring 2009

Date	Speaker	Title
Jan 13, 2009	Nate Xiaoming Xu	A Generic Top-Down Dynamic-Programming Approach to Prefix-Free Coding
Jan 27, 2009	Venkatesan Guruswami	Coding theory in pseudorandomness: A survey of two recent applications
Feb 03, 2009	Youngtae Youn	Tackling the Poor Assumptions of Naive Bayes Text Classifiers
Feb 10, 2009	Abhradeep Guha Thakurta	Principal Component Analysis, Singular Value Decompositions and the Random Projection Method
Feb 17, 2009	Fang Song	Hard Core Predicates for One Way Functions
Feb 24, 2009	Madhav Jha	Basic techniques in derandomization
Mar 3, 2009	Huiwen YU	Expander Graphs
Mar 17, 2009	Piotr Berman	Covering by angle sectors
Mar 24, 2009	Youngtae Youn	Basing Identity Based Encryption on Trapdoor Permutations
Apr 7, 2009	Abhradeep Guha Thakurta	Confluence of Learning Theory and Differential Privacy
Apr 14, 2009	Madhav Jha	Derandomization under complexity assumptions: The Nisan-Wigderson Construction
Apr 21, 2009	Huiwen YU	Explicit construction of Expander Graphs using Zig-Zag Product
Apr 28, 2009	Fang song	Lossy Trapdoor Functions: Applications and Realization

All talks will be at 3:15 pm on Mondays in IST 333 (unless they are part of the departmental colloquium).

If you would like to speak, please email: theory at cse psu dot edu.

If you want to receive announcements of upcoming talks join the theory mailing list.

A Generic Top-Down Dynamic-Programming Approach to Prefix-Free Coding (joint work with Mordecai Golin and Jiajin Yu, published at SODA 2009)

Given a probability distribution over a set of n words to be transmitted, the Huffman Coding problem is to find a minimal-cost prefix free code for transmitting those words. The basic Huffman coding problem can be solved in O(n log n) time but variations are more difficult. One of the standard techniques for solving these variations utilizes a top-down dynamic programming approach. In this paper we show that this approach is amenable to dynamic programming speedup techniques, permitting a speedup of an order of magnitude for many algorithms in the literature for such variations as mixed radix, reserved length and one-ended coding. These speedups are immediate implications of a general structural property that permits batching together the calculation of many DP entries.

Coding theory in pseudorandomness: A survey of two recent applications

There is a rich interplay between coding theory and computational complexity theory that has enriched both disciplines over the years. In particular, error-correcting codes have been instrumental in several advances in the subject of pseudorandomness, leading to explicit constructions of objects (such as expander graphs, randomness extractors, pseudorandom generators, matrices, etc.) with desirable properties similar to those achieved by random objects. This talk will survey recently discovered applications of coding-theoretic ideas in pseudorandomness, showcasing two examples: the construction of lossless expanders and randomness extractors from a variant of Reed-Solomon codes, and the construction of Euclidean sections and compressed sensing matrices from expander codes.

Tackling the Poor Assumptions of Naive Bayes Text Classifiers

An influential paper by Rennie et al. [1] identified many of the modeling weaknesses of the Naive Bayes classifier and proposed various ways of improving the model. One such weakness, the ‘skewed data bias’ supposedly causes Naive Bayes to prefer classes with more data points. We show that the reasoning supporting this claim is flawed and that experiments also show no significant bias. Instead we show that variance, not bias is the bigger cause of performance degradation and we propose a variance reduction technique to improve the performance of Naive Bayes.

Bibliography [1] Jason D. Rennie, Lawrence Shih, Jaime Teevan and David R. Karger, Tackling the Poor Assumptions of Naive Bayes Text Classifiers, International Conference of Machine Learning 2003. http://www.cse.psu.edu/~ytyoun/NB/rennie.icml03.pdf

Principal Component Analysis, Singular Value Decompositions and the Random Projection Method

Dimension reduction is a key component of present day data analysis. Given a data set, one would like to find a small number of features that capture the most interesting relationships among the data points. Principal Component Analysis (PCA) is an extremely efficient tool which helps in dimensionality reduction. The first half of this talk will define PCA and discuss some applications. In the second half, we will discuss a randomized algorithm for fast approximate PCA, based on random projections.

Hard Core Predicates for One Way Functions

I will introduce the concepts of one way functions and hard core predicates, and in detail I will show the construction and proof of Goldreich-Levin hard core predicates for one way functions. This topic is contained in [1] and the exposition in [2] is somewhat easier to follow.

References: [1] Goldreich, Foundations of Cryptography. vol 1. chapter 2. [2] Katz-Lindell, Intro. to Modern Cryptography. chapter 6.

Basic techniques in derandomization

The talk will motivate the problem of finding deterministic versions of randomized algorithms and illustrate two basic derandomization techniques: the method of conditional expectations and the generation of pairwise-independent random variables.

Expander Graphs

Topics to be covered are: 1. Proof of the existence of expanders using probabilistic method; 2. Rapid mixing on expanders; 3. Explicit construction using Zig-zag product.

References: [1] S. Hoory, N. Linial , and A. Wigderson. “Expander Graphs and their Applications”. Bull. Amer. Math Soc., 43, pp 439–561, 2006 [2] O. Reingold, S. Vadhan, A. Wigderson. “Entropy Waves, The Zig-Zag Graph Product, and New Constant-Degree Expanders and Extractors”. Proceedings of the 41st FOCS, pp. 3-13, 2000.

Covering by angle sectors

We consider problems of set cover and capacitated set cover where the elements to be covered are points on Euclidean plane, and sets are angle sectors of some fixed area, such a set is an intersection of an angle sector of angle width 1/r with a disk of radius r (both with centers at the origin). We show that the minimum set cover can be found in time O(n^4) using space O(n^2). In capacitated version of the problem every element has a weight, and a set used in the cover has to satisfy two conditions: (a) it is a subset of an explicitely specified set (b) the sum of weights is at most 1 If we are given an exact solution to an instance of set cover, and we are also given weights of the elements, we can find an approximate solution to that capacitated instance with approximation ratio 2.347. (Joint work with Karpinski and Lingas, follow-up of joint work with Jeong, Kasiviswanathan and Urgaonkar.)

Basing Identity Based Encryption on Trapdoor Permutations

Identity-based encryption (IBE) is a special type of public key encryption scheme where a user’s public key is a unique information about the identity of the user (e.g., a user’s email address). Known constructions of IBE are based on the particular properties of hard problems in elliptic curves and integer lattices. So it is natural to ask whether a secure IBE can be constructed from generic cryptographic primitives. In 2008, Boneh et al. [Bon00] showed that there is no black-box construction of IBE from trapdoor permutations (TDP). In this talk, I will explain the actual separation proof and the meaning of this negative result, along with some topics in complexity theory such as reduction, oracle, and relativization.

References [Bon00] Dan Boneh, Periklis A. Papakonstantinou, Charles Rackoff, Yevgeniy Vahlis, Brent Waters. On the Impossibility of Basing Identity Based Encryption on Trapdoor Permutations. FOCS 2008.

Confluence of Learning Theory and Differential Privacy

The talk covers: 1.Some very basic overview of Vapnik-Chervonenkis dimension. 2. Prove the Uniform Convergence Theorem. 3. Use these concepts to design a mechanism to generate a synthetic dataset which preserves some of the important properties of the original dataset but maintaining differential privacy.

References 1. A Learning Theory Approach to Non-Interactive Database Privacy. Avrim Blum, Katrina Ligett, Aaraon Roth. 2. Mechanism Design via Differential Privacy. Frank McSherry, Kunal Talwar. 3. Calibrating Noise to Sensitivity in Private Data Analysis. Cynthia Dwork,Frank McSherry, Kobbi Nissim, Adam Smith.

Derandomization under complexity assumptions: The Nisan-Wigderson Construction

Abstract: The fundamental problem that complexity theory addresses is following: What do we mean by efficiently solvable problem? Two natural choices for defining such a class have emerged: P : the class of problems solvable by deterministic algorithms running in polynomial time, and, BPP: the class of problems solvable by polynomial time *randomized* algorithms with bounded error probability. It is thus natural to ask if these classes coincide, that is, is P = BPP? It turns out that such a statement can be conditionally proven based on complexity assumptions. It all started with the seminal work of Nisan and Wigderson who showed that if there was a function computable in exponential time that was “exponentially hard” for circuits of slightly less than exponential size, then BPP=P. We will look at their construction and see the proof of their result.

References: O. Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.

N. Nisan and A. Wigderson. Hardness vs. randomness. Journal of Computer Systems and Sciences, volume 49,149-167, 1994.

Explicit construction of Expander Graphs using Zig-Zag Product

Abstract: The talk is going to be on “Explicit construction of Expander Graphs using Zig-Zag Product”. The construction was proposed by the paper “Entropy Waves, The Zig-Zag Graph Product, and New Constant-Degree Expanders and Extractors” by O. Reingold, S. Vadhan, and A. Wigderson. The construction as well as proof of correctness will be covered in the talk.

Lossy Trapdoor Functions: Applications and Realization

Abstract: Two long-standing problems in cryptography are to construct trapdoor functions from assumptions not directly related to factoring and to design chosen ciphertext secure crypto-systems based on lattices problems. The recent paper [PW08] proposed a new primitive, called a lossy trapdoor function, and use the idea to solve both problems. I will introduce this new notion and give the construction of trapdoor functions and collision-resistant hash functions based on it. Finally, I will show how to realize lossy trapdoors from the DDH assumption.

References: 1] Chris Peikert, Brent Waters: Lossy trapdoor functions and their applications. STOC 2008

[2] Oded Regev: On lattices, learning with errors, random linear codes, and cryptography. STOC 2005