Android applications pose security and privacy risks for end-users. These risks are often quantified by performing dynamic analysis and permission analysis of the Android applications after release. Predicting the security and privacy risks associated with Android applications at early stages of development, e.g. when the developer(s) are writing the code of the application, might help Android application developers release applications that carry less security and privacy risk for end-users. The goal of this paper is to aid Android application developers in assessing the security and privacy risk associated with Android applications by using static code metrics as predictors. In our paper, we consider the security and privacy risk of an Android application to be how susceptible the application is to leaking private information of end-users and to releasing vulnerabilities. We investigate how effectively static code metrics extracted from the source code of Android applications can be used to predict the security and privacy risk of those applications. We collected 21 static code metrics for 1,407 Android applications and used them to predict the security and privacy risk of the applications. As the oracle of security and privacy risk, we used Androrisk, a tool that quantifies the security and privacy risk of an Android application using analysis of Android permissions and dynamic analysis. To accomplish our goal, we used statistical learners such as the radial-based support vector machine (r-SVM). For r-SVM, we observed a precision of 0.83. Findings from our paper suggest that, with proper selection of static code metrics, r-SVM can be used effectively to predict the security and privacy risk of Android applications.
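For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of the radial basis function kernel that a radial-based SVM is built on, and of precision as it is conventionally defined (true positives over all predicted positives). The function names, the `gamma` parameter, and the binary label encoding are assumptions made for illustration; the abstract does not state the kernel settings or how the risk labels were encoded.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: K(x, y) = exp(-gamma * ||x - y||^2).

    x and y are equal-length sequences of numeric features, e.g. static
    code metrics of two applications. gamma is a hypothetical setting.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def precision(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP) for the class labeled `positive`.

    Returns 0.0 when nothing was predicted positive.
    """
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if p == positive and t == positive)
    false_pos = sum(1 for t, p in zip(y_true, y_pred)
                    if p == positive and t != positive)
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0
```

Under this definition, a precision of 0.83 would mean that 83% of the applications flagged with a given risk label actually carried that label according to the oracle.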
7. Dataset
• Dataset from Krutz et al. included 4,416 Android applications
• 1,407 applications included AndroRisk scores. AndroRisk is a tool that is part of the AndroGuard toolchain
• Five risk levels: very low (VL), low (L), medium (M), high (H), very high (VH)
http://blog.k3170makan.com/2014/11/automated-dex-decompilation-using.html
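One way a numeric risk score could be turned into the five ordinal levels listed on this slide is sketched below. The slide does not give the cut points, so the 0 to 100 score scale and the equal-width thresholds in `CUTS` are purely hypothetical, as is the helper name `risk_level`.

```python
from bisect import bisect_right

# Ordered from lowest to highest risk, as named on the slide.
LEVELS = ["VL", "L", "M", "H", "VH"]

# HYPOTHETICAL cut points on an assumed 0-100 score scale.
# The slides do not state the thresholds that were actually used.
CUTS = [20, 40, 60, 80]


def risk_level(score):
    """Map a numeric risk score to one of the five ordinal levels."""
    # bisect_right counts how many cut points are <= score,
    # which is exactly the index of the matching level.
    return LEVELS[bisect_right(CUTS, score)]
```

With these assumed thresholds, a score sitting exactly on a cut point falls into the higher of the two neighbouring levels.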
8. Dataset
• Dataset from Krutz et al. included 21 code metrics
Category | Metrics
Bad Coding Practice | Blocker practices, Critical practices, Major practices, Minor practices, Total bad coding practices
Duplication | Duplicated blocks, Duplicated files, Duplicated lines
Object-oriented | Class complexity, Comment lines, Complexity, Density of comment lines, Files, File complexity, Function complexity, Lines, Lines of code, Methods, Number of classes, Percentage of comments, Percentage of duplicated lines
https://www.sonarqube.org/community/logos/
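Two of the metrics in the table are ratios of other metrics in the same table. The sketch below shows how they could be derived, assuming they follow SonarQube's documented formulas (comment lines over comment lines plus lines of code, and duplicated lines over total lines, each times 100). The slides do not define the metrics, so both the formulas and the function names are assumptions.

```python
def comment_density(comment_lines, lines_of_code):
    """Density of comment lines, in percent.

    Assumed formula: comment_lines / (lines_of_code + comment_lines) * 100.
    """
    total = comment_lines + lines_of_code
    return 100.0 * comment_lines / total if total else 0.0


def duplicated_lines_pct(duplicated_lines, lines):
    """Percentage of duplicated lines.

    Assumed formula: duplicated_lines / lines * 100.
    """
    return 100.0 * duplicated_lines / lines if lines else 0.0
```

For example, a file with 25 comment lines and 75 lines of code would have a comment density of 25% under this formula.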