Computer Science Graduate Research Seminar

Organized by: Dr. Jianhua Ruan with support from CS faculty (especially Drs. Weining Zhang, Meng Yu, John Quarles, Ali Tosun and Qi Tian) and the CS Graduate Student Association (CSGSA).
Time: 1:00-2:00 pm, Thurs (unless otherwise indicated below)
Place: NPB 3.108, CS Conference Room (unless otherwise indicated below)

Schedule for Spring 2018

Seminar information

3/29 A: "Scientific Investigation of Data Transmissions in IoT Devices." by Olumide Kayode

Abstract: In the current era of the Internet of Things (IoT), the data privacy and security of Internet-enabled devices have become a major concern for many users and manufacturers. Massive amounts of data are being generated by these IoT devices, and users' information may be exposed without any privacy protection. The rate of data transfer, the size and kinds of information transmitted, and the secure channels used by these IoT devices are of utmost importance and demand more exploratory research. In this work, we investigate the data transmitted by five IoT devices (TP-Link Smart Wi-Fi Bulb, TP-Link Smart Wi-Fi Plug, Insignia Wi-Fi Smart Plug, Fitbit Wi-Fi Smart Scale and Nixplay Seed Wi-Fi Cloud Frame) and analyze it, using a proxy to collect both HTTP and HTTPS traffic. We aim to identify any vulnerabilities in the data transmission of these IoT devices.

3/29 B: "Security and Privacy of Cyber and Physical User Interactions in The Age of Wearable Computing" by Anindya Maiti

Abstract: Wearable devices are a new form of technology that is quickly gaining popularity among mobile users. These 'smart' wearable devices are equipped with a variety of high-precision sensors that enable the collection of rich contextual information related to the wearer and his/her surroundings, which, in turn, enables a variety of novel applications. The presence of a diverse set of zero-permission sensors, however, also exposes an additional attack surface which, if not adequately protected, could be exploited to leak private user information. The first part of my research aims to develop a comprehensive technical understanding of the privacy risks associated with inference of private user interactions with other cyber and physical systems, primarily using wrist-wearables. A detailed evaluation of novel attack frameworks validates the feasibility of inference attacks on cyber interfaces (such as mobile keypads and external computer keyboards) and on physical systems (such as padlocks and safes). To thwart these emerging privacy threats, effective and usable techniques for detecting and mitigating wearable device misuse are critical and urgently needed. Consequently, the second part of my research aims to protect user interactions by proposing new protection measures, which follow two different strategies. The proposed design-time protection measures try to prevent inference attacks by altering the interaction interfaces, while the run-time protection measures use contextual information to dynamically regulate zero-permission sensor data when users are detected to be vulnerable to a known inference attack.


4/5 A: "Quantifying the Security Effectiveness of Firewalls and DMZs" by Huashan Chen

Abstract: Firewalls and Demilitarized Zones (DMZs) are two mechanisms that have been widely employed to secure enterprise networks. Despite this, their security effectiveness has not been systematically quantified. In this paper, we make a first step towards filling this void by presenting a representational framework for investigating their security effectiveness in protecting enterprise networks. Through simulation experiments, we draw useful insights into the security effectiveness of firewalls and DMZs. To the best of our knowledge, these insights were not reported in the literature until now.

4/5 B: "Testing Cloud Applications under Cloud-Uncertainty Performance Effects." by Sen He

Abstract: The paradigm shift of deploying applications to the cloud has introduced both opportunities and challenges. Although clouds use elasticity to scale resource usage at runtime to help meet an application's performance requirements, developers are still challenged by unpredictable performance, little control over the execution environment, and differences among cloud service providers, all while being charged for their cloud usage. Application performance stability is particularly affected by multi-tenancy, in which the hardware is shared among varying applications and virtual machines. Developers porting their applications need to meet performance requirements, but testing on the cloud under the effects of performance uncertainty is difficult and expensive, due to high cloud usage costs. This paper presents a first approach to testing an application with typical inputs for how its performance will be affected by performance uncertainty, without incurring the undue costs of brute-force testing in the cloud. We specify cloud uncertainty testing criteria, design a test-based strategy to characterize the black-box cloud's performance distributions using these testing criteria, and support execution of tests to characterize the resource usage and cloud baseline performance of the application to be deployed. Importantly, we developed a smart test oracle that estimates the application's performance with certain confidence levels using the above characterization test results and determines whether it will meet its performance requirements. We evaluated our testing approach on both the Chameleon cloud and Amazon Web Services; results indicate that this testing strategy shows promise as a cost-effective approach to test for performance effects of cloud uncertainty when porting an application to the cloud.

4/19 A: "TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis" by Maryam Zand

Abstract: When analyzing single-cell RNA-seq data, constructing a pseudo-temporal path to order cells based on the gradual transition of their transcriptomes is a useful way to study gene expression dynamics in a heterogeneous cell population. Currently, a limited number of computational tools are available for this task, and quantitative methods for comparing different tools are lacking. Tools for Single Cell Analysis (TSCAN) is a software tool developed to better support in silico pseudo-Time reconstruction in Single-Cell RNA-seq ANalysis. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters and an MST is then constructed to connect cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space. This often leads to improved cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, quantitative measures are developed to objectively evaluate and compare different pseudo-time reconstruction methods. TSCAN is available at and as a Bioconductor package.
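
The cluster-then-MST ordering described in the abstract above can be illustrated with a small sketch. This is not the TSCAN implementation: it assumes cells are already reduced to 2-D coordinates and already clustered, uses the tree path with the most clusters as the main path, and projects each cell onto the nearest segment of that path. All function names are hypothetical, and the sketch assumes at least two clusters with distinct centers.

```python
import math
from collections import defaultdict

def cluster_centers(cells, labels):
    """Mean 2-D coordinate of the cells in each cluster."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), lab in zip(cells, labels):
        s = sums[lab]
        s[0] += x
        s[1] += y
        s[2] += 1
    return {lab: (s[0] / s[2], s[1] / s[2]) for lab, s in sums.items()}

def minimum_spanning_tree(centers):
    """Prim's algorithm over cluster centers; returns a list of edges."""
    labs = sorted(centers)
    in_tree = {labs[0]}
    edges = []
    while len(in_tree) < len(labs):
        _, a, b = min(
            (math.dist(centers[a], centers[b]), a, b)
            for a in in_tree for b in labs if b not in in_tree
        )
        edges.append((a, b))
        in_tree.add(b)
    return edges

def longest_path(edges):
    """Path through the tree that visits the most clusters (double sweep)."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    def farthest(start):
        best = [start]
        stack = [(start, [start])]
        while stack:
            node, path = stack.pop()
            if len(path) > len(best):
                best = path
            for nxt in adj[node]:
                if nxt not in path:
                    stack.append((nxt, path + [nxt]))
        return best

    end = farthest(next(iter(adj)))[-1]
    return farthest(end)

def pseudotime(cells, labels):
    """Distance along the main path of each cell's nearest projection."""
    centers = cluster_centers(cells, labels)
    path = longest_path(minimum_spanning_tree(centers))
    pts = [centers[lab] for lab in path]
    times = []
    for p in cells:
        best = None
        offset = 0.0  # path length accumulated before the current segment
        for a, b in zip(pts, pts[1:]):
            seg = math.dist(a, b)
            t = ((p[0] - a[0]) * (b[0] - a[0])
                 + (p[1] - a[1]) * (b[1] - a[1])) / (seg * seg)
            t = max(0.0, min(1.0, t))  # clamp the projection to the segment
            proj = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            d = math.dist(p, proj)
            if best is None or d < best[0]:
                best = (d, offset + t * seg)
            offset += seg
        times.append(best[1])
    return times
```

The direction of the resulting ordering is arbitrary in this sketch (it depends on which end of the path is found first); in practice the starting end would be chosen from prior knowledge, as the abstract notes for user-adjusted ordering.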

4/19 B: "De novo pathway-based biomarker identification" by Nahim Adnan

Abstract: Gene expression profiles have been extensively discussed as an aid to guide therapy by predicting disease outcome for patients suffering from complex diseases, such as cancer. However, prediction models built upon single-gene (SG) features show poor stability and performance on independent datasets. Attempts to mitigate these drawbacks have led to the development of network-based approaches that integrate pathway information to produce meta-gene (MG) features. Moreover, MG approaches have only dealt with the two-class problem of good versus poor outcome prediction. Stratifying patients based on their molecular subtypes can provide a detailed view of the disease and lead to more personalized therapies. We propose and discuss a novel MG approach based on de novo pathways, which for the first time have been used as features in a multi-class setting to predict cancer subtypes. Comprehensive evaluation in a large cohort of breast cancer samples from The Cancer Genome Atlas (TCGA) revealed that MGs are considerably more stable than SG models, while also providing valuable insight into the cancer hallmarks that drive them. In addition, when tested on an independent benchmark non-TCGA dataset, MG features consistently outperformed SG models. We provide an easy-to-use web service at where users can upload their own gene expression datasets from breast cancer studies and obtain the subtype predictions from all the classifiers.

4/26 A: "Hybrid.AI: A Learning Search Engine for Large-scale Structured Data" by Sean Soderman

Abstract: The variety of Big Data is a significant impediment for anyone who wants to search inside a large-scale structured dataset. For example, there are millions of tables available on the Web, but the most relevant search result does not necessarily match the keyword query exactly, because the same information can be represented in a variety of ways. Here we describe Hybrid.AI, a learning search engine for large-scale structured data that uses automatically generated machine learning classifiers and Unified Famous Objects (UFOs) to return the most relevant search results from a large-scale Web tables corpus. We evaluate it over this corpus, collecting 99 queries and their results from users, and observe a significant relevance gain.

References: Sean Soderman, Anusha Kola, Maksim Podkorytov, Michael Geyer, Michael Gubanov. 2018. Hybrid.AI: A learning Search Engine for Large-scale Structured Data. In WWW '18 Companion: The 2018 Web Conference Companion, April 23-27, 2018, Lyon, France. ACM, New York, NY, USA, 8 pages.

4/26 B: "Mining roles for successful RBAC deployment" by Shuvra Chakraborty

Abstract: Designing an optimal set of roles, along with the user-role and role-permission assignments, is the most challenging part of RBAC deployment. To ease the process of role design, Role Mining (RM) starts with the assumption that a set of user-permission assignments, and optionally other information as required, is already given. Many algorithms already exist to accomplish the task of role mining. In my discussion, I focus on a brief categorization of role mining and an explanation of a role mining approach that considers both role and policy quality metrics to find an optimal set of roles while preserving consistency with the given user-permission assignment.
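
To make the role mining setting concrete, here is a minimal sketch under stated assumptions. It is not the approach presented in the talk: it uses a naive baseline that creates one role per distinct permission set, checks consistency with the given user-permission assignment, and reports a simple size-based cost (roles plus assignments) chosen purely for illustration. All function names and the sample data are hypothetical.

```python
def mine_roles(upa):
    """Naive baseline: users with identical permission sets share one role.

    upa maps each user to the set of permissions they hold.
    """
    roles = {}       # frozenset of permissions -> role name
    user_roles = {}  # user -> set of role names
    for user in sorted(upa):
        perms = frozenset(upa[user])
        if perms not in roles:
            roles[perms] = f"R{len(roles) + 1}"
        user_roles[user] = {roles[perms]}
    role_perms = {name: set(perms) for perms, name in roles.items()}
    return user_roles, role_perms

def is_consistent(upa, user_roles, role_perms):
    """True if every user gets exactly their original permissions via roles."""
    for user, perms in upa.items():
        granted = set()
        for role in user_roles[user]:
            granted |= role_perms[role]
        if granted != set(perms):
            return False
    return True

def size_cost(user_roles, role_perms):
    """Illustrative quality measure: number of roles plus all assignments."""
    return (len(role_perms)
            + sum(len(r) for r in user_roles.values())
            + sum(len(p) for p in role_perms.values()))

# Hypothetical user-permission assignment.
upa = {
    "alice": {"read", "write"},
    "bob": {"read", "write"},
    "carol": {"read"},
}
user_roles, role_perms = mine_roles(upa)
print(user_roles, role_perms, is_consistent(upa, user_roles, role_perms))
```

A real role mining algorithm would search for a smaller or higher-quality role set than this baseline; the consistency check is the part that carries over unchanged.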


Questions and Comments?

Please send emails to