Online privacy risks

Rethinking the nature of risk associated with disclosure of personal information

• We are all subject to risks when we disclose information about ourselves on social media and other online services
• These risks are balanced against the benefits of signing up to social media
• The decisions that individuals make when they sign up to social media can be described by a ‘privacy calculus’ model
• Many attempts have been made to identify the risks and benefits of online services and to categorise them
• The objective of this research is to gather empirical data to support/challenge established risk models and to produce a more robust model for estimating risk levels

This study uses several methods to look at individual online behaviour in order to identify what personal data is disclosed during online sessions.  It will also explore the risks associated with disclosure.  Individual case studies of online use will provide transaction logs which will be examined to highlight any risks that arise.  The study will work through the public library system to recruit a diverse range of individuals and to provide a safe environment for follow-up interviews.  The study will also identify and interview subject experts to gain an understanding of how risk is handled in other industries (e.g. insurance, engineering safety, health) and to see how that might be applied to individual online behaviour.

The research builds on my doctoral work focused on the risks associated with access to personal data on online social networking services (Haynes 2015), which tested the idea that personal risk could be used as a way of assessing regulatory effectiveness (Haynes et al. 2016; Haynes & Robinson 2015).

Haynes, D., 2015. Risk and Regulation of Access to Personal Data on Online Social Networking Services in the UK. City University London.
Haynes, D., Bawden, D. & Robinson, L., 2016. A Regulatory Model for Personal Data on Social Networking Services in the UK. International Journal of Information Management, 36(6), pp.872–882.
Haynes, D. & Robinson, L., 2015. Defining User Risk in Social Networking Services. Aslib Journal of Information Management, 67(1), pp.94–115.

This Fellowship is supported by the Royal Academy of Engineering and the Office of the Chief Science Adviser for National Security under the UK Intelligence Community Postdoctoral Fellowship Programme.

December 2017

Ontology of Risk – presentation of a privacy risk model

Here are some slides from a demonstration of the privacy risk model that I gave online at Edinburgh Napier University on 3rd June 2020. The slides capture snapshots from the live demo using Synaptica’s Graphite software. The presentation also identifies possible avenues for development of the ontology.

Visualizing uncertainty

A recent meeting about risk communication at Birkbeck, University of London, entitled: ‘Talking Flooding: linkages between communication, risk and resilience in Flood-prone communities’ led me to think about risk and uncertainty.  Although the day was focused on flood risk, it provided a useful insight into the way one particular sector deals with risk and some of the techniques that have been developed to represent risk and perceptions of risk.  A presentation by Aidan Slingsby and Cagatay Turkay on ‘Visualisation and Uncertainty’ described different ways of representing uncertainty visually:

  • Fuzziness
  • Shading
  • Colour bands
  • Transparency
  • Resolution
  • Smoothness
  • Continuity

Further reading identified two further methods:

  • Lightness to signify uncertainty about the classification of a post code (Slingsby et al. 2011) or for projected values (Wong 2010, p.63)
  • Sketchiness to signify the fact of uncertainty (such as future projections) rather than indicating the level of uncertainty (Wood et al. 2012)

Uncertainty has several aspects, including the accuracy of the data, its precision, the degree of ambiguity and the level of confidence in the data.  Where these can be expressed numerically, they can also be represented visually.  Aidan has suggested that, of these measures, “some may be considered more intuitive (high uncertainty being more transparent, less light, more fuzzy-looking, etc)”.  Uncertainty may be due to poor data, an inappropriate model, or a poor understanding of the phenomenon being represented.
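As a minimal sketch of how a numerical uncertainty measure might drive a visual one, the following Python function maps an uncertainty score to an opacity and a blur radius, so that more uncertain marks look more transparent and fuzzier. The 0 to 1 scale, the function name and the scaling constants are all assumptions made for illustration; they do not come from the presentation.

```python
def uncertainty_to_style(uncertainty: float) -> dict:
    """Map an uncertainty score in [0, 1] to illustrative drawing parameters."""
    if not 0.0 <= uncertainty <= 1.0:
        raise ValueError("uncertainty must be between 0 and 1")
    return {
        # More uncertain marks are drawn more transparent (lower opacity).
        # The 0.8 factor is an arbitrary choice that keeps marks visible.
        "opacity": round(1.0 - 0.8 * uncertainty, 2),
        # More uncertain marks get a wider blur, so they look fuzzier.
        # The 6-pixel maximum is likewise an arbitrary choice.
        "blur_radius_px": round(6.0 * uncertainty, 1),
    }


# Hypothetical labels and scores, for illustration only.
for label, score in [("well-measured", 0.1), ("rough estimate", 0.8)]:
    print(label, uncertainty_to_style(score))
```

The same pattern could be extended to other variables such as lightness or texture density, provided the direction of the mapping is tested with real readers.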

There is already an established visual vocabulary, which is described by Bertin (2011, pp.42–43).  He identifies eight graphic variables which form the starting point for any discussion of visualization:

  • Size
  • Value
  • Texture
  • Colour
  • Orientation
  • Shape
  • Two planar dimensions (2 variables)

Figure 1 – Bertin’s Visual Variables

We can use some of these visual variables to represent uncertainty.  When we talk about risk there are two considerations:

  • What is the probability of a risk event occurring?
  • What is the impact of the risk event (consequence)?

In exploring the first of these we face two further questions:  ‘How do we characterize a risk event?’ and ‘Are there established categories or do we have to develop a typology of our own?’ (Haynes & Robinson 2015).  For example, if someone is browsing online what is the probability that they will enter a malicious site?  Malicious sites could be defined by whether or not they are on a published list of known malicious sites.  This is probably most meaningful across a large population so that different variables can be taken into account.  For instance: frequency of online searches, online duration, experience, attitude to risk, whether or not there is anti-virus software on the device, operating system etc.  Different population groups could be examined: nationality, age group, gender identity, socioeconomic group, educational attainment etc.
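The malicious-site example can be sketched in Python as follows. This is only an illustration under stated assumptions: the blocklist, the domain names, the group labels and the browsing log are all hypothetical, and a real study would draw on a published list and on properly sampled transaction logs.

```python
from collections import defaultdict

# Hypothetical stand-in for a published list of known malicious sites.
KNOWN_MALICIOUS = {"bad.example", "phish.example"}

# Hypothetical browsing log: (population group, domain visited).
VISITS = [
    ("group_a", "news.example"),
    ("group_a", "bad.example"),
    ("group_a", "shop.example"),
    ("group_a", "mail.example"),
    ("group_b", "phish.example"),
    ("group_b", "news.example"),
]


def malicious_visit_rate(visits, blocklist):
    """Return, per group, the share of visits that hit a listed site."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, domain in visits:
        totals[group] += 1
        if domain in blocklist:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


rates = malicious_visit_rate(VISITS, KNOWN_MALICIOUS)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

With so few records the rates are meaningless in themselves, which is the reason the estimate is most useful across a large population.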

Estimates of the levels of occurrence of a particular risk event extrapolated to a general population would have an upper and a lower limit.  These upper and lower estimates are likely to be imprecise and it might be useful to signal this visually.  For example, in PowerPoint it is possible to generate a chart with a blurred effect:

Figure 2 – Showing uncertainty with blurred outlines and shading

The lower range is shown in a darker shade because it falls within both the lower and the upper estimate.  The upper range is shown in a lighter colour because it falls within the upper estimate only.  An alternative might be to use texturing to indicate different levels of certainty:

Figure 3 – Showing uncertainty with texture

There is clearly further work to be done on presentation of results where there is a degree of uncertainty.  A possible line of development might be to test different presentations of uncertainty to see how that would affect perceptions of risk.
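One way to arrive at the upper and lower limits discussed above is a confidence interval for a proportion. The sketch below uses the Wilson score interval, which is my choice for illustration and not a method prescribed by this study; the event counts are invented.

```python
import math


def wilson_interval(events: int, trials: int, z: float = 1.96):
    """Lower and upper limits for a proportion (Wilson score interval).

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0 or not 0 <= events <= trials:
        raise ValueError("need trials > 0 and 0 <= events <= trials")
    p = events / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / trials + z**2 / (4 * trials**2)
    )
    # Clamp to the valid range for a proportion.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical sample: 12 risk events observed in 200 online sessions.
low, high = wilson_interval(events=12, trials=200)
print(f"observed rate: {12 / 200:.3f}, plausible range: {low:.3f} to {high:.3f}")
```

The width of the interval (high minus low) is a natural candidate for the quantity that blurring, shading or texture would then be asked to convey.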


Bertin, J., 2011. Semiology of Graphics: diagrams, networks, maps. Translated by W. J. Berg. Redlands, CA: Esri Press.

Haynes, D. & Robinson, L., 2015. Defining User Risk in Social Networking Services. Aslib Journal of Information Management, 67(1), pp.94–115.

Slingsby, A., Dykes, J. & Wood, J., 2011. Exploring Uncertainty in Geodemographics with Interactive Graphics. IEEE Transactions on Visualization and Computer Graphics, 17(12), pp.2545–2554.

Wong, D.M., 2010. The Wall Street Journal Guide to Information Graphics: the dos and don’ts of presenting data, facts and figures, New York: W.W.Norton.

Wood, J. et al., 2012. Sketchy Rendering for Information Visualization. IEEE Transactions on Visualization and Computer Graphics, 18(12), pp.2749–2758.

July, 2018

Metadata for information management and retrieval

photo taken by John Stevenson, April 2018

book launch at City, University of London on 4th April 2018

The Snowden revelations brought into sharp focus the ethical issues surrounding metadata creation and use.  They raise questions about privacy, security, ownership and control of metadata, and they challenge information professionals to work out how to manage these issues.

We also need to consider wider issues such as the digital divide and the potential that metadata has for making information accessible to wider audiences. It has the potential to empower the marginalised, hold government to account and improve quality of life.  Maybe it is also a response to the burgeoning of fake news and fact-free content.

In the new edition of Metadata for Information Management and Retrieval, published this month, I consider the origins of metadata and look at the ways in which it is used for managing information resources as well as for information retrieval.  The book covers current metadata standards and compares the ways in which they are used for managing different types of resource ranging from linked data, to images, to the more familiar text-based materials.

Haynes, D., 2018. Metadata for Information Management and Retrieval: understanding metadata and its use, London: Facet Publishing. ISBN 9781856048248. 288pp. http://www.facetpublishing.co.uk/title.php?id=048248

April, 2018

Ontology of online risk

In order to handle the complex relationships between concepts in the Risk domain I am developing an ontology of risk.  This will allow for cause and effect relationships as well as the classic hierarchical relationships found in taxonomies.

This diagram highlights one particular node, ‘harassment’, and looks at what might lead to this risk as well as what consequences might result.  It is generated from a test set of data from Haynes and Robinson (2015).
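The idea of combining hierarchical and cause-and-effect relationships can be sketched in a few lines of Python. This is not the Graphite model itself: the class, the relation names and every term other than the harassment node are hypothetical, chosen only to show how a single node can be queried for its causes and consequences.

```python
from collections import defaultdict


class RiskOntology:
    """Tiny store of typed, directed relationships between concepts."""

    def __init__(self):
        # Maps (relation, source concept) to the set of target concepts.
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        self._edges[(relation, source)].add(target)

    def targets(self, source, relation):
        """Concepts that `source` points to via `relation`."""
        return sorted(self._edges.get((relation, source), set()))

    def sources(self, target, relation):
        """Concepts that point to `target` via `relation`."""
        return sorted(
            src
            for (rel, src), tgts in self._edges.items()
            if rel == relation and target in tgts
        )


onto = RiskOntology()
# Hierarchical link, as in a taxonomy (hypothetical example terms).
onto.add("harassment", "broader", "personal safety risk")
# Cause-and-effect links (hypothetical example terms).
onto.add("location disclosure", "leads_to", "harassment")
onto.add("public profile", "leads_to", "harassment")
onto.add("harassment", "leads_to", "emotional distress")

print("causes:", onto.sources("harassment", "leads_to"))
print("consequences:", onto.targets("harassment", "leads_to"))
print("broader term:", onto.targets("harassment", "broader"))
```

Keeping the relation type on every link is what distinguishes this from a plain taxonomy, where only the broader and narrower links would exist.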

Also attached is a set of slides from a workshop at which this diagram was first presented.  The workshop was held in January 2019 and hosted by ISKO UK (International Society for Knowledge Organization – UK chapter).

January 2019

The Nature of Risk and the Privacy Calculus – research questions

In December 2017 I embarked on a two-year research programme at City, University of London to investigate the nature of risks that individuals face when they disclose personal information online. This is an outline of the research questions that I set out to answer.

Rationale for the research: Public safety is improved if individual users are able to make informed choices about what personal information they disclose online.

Background: Privacy calculus is one way of describing how individuals balance the perceived risks and benefits of online transactions (Krasnova et al. 2012; Dinev & Hart 2006).  Any such calculation depends on the way in which risks are described.  However, there is no consensus on a risk typology (Rosenblum 2007; Swedlow et al. 2009, p.237; Facebook Inc. 2017; Haynes & Robinson 2015).  One of the objectives of this research was to investigate the nature of risk associated with disclosure of personal data online based on monitoring actual user behaviour.
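To make the balancing idea concrete, here is a deliberately simplified Python sketch. It assumes that each perceived risk can be scored as probability multiplied by impact and that benefit can be put on the same scale; both are my assumptions for illustration and are not the formulation used in the cited privacy calculus studies. The risk names and numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # perceived chance of the risk event, 0 to 1
    impact: float       # perceived severity on an arbitrary 0 to 10 scale


def perceived_risk(risks):
    """Sum of probability x impact over the risks the user has in mind."""
    return sum(r.probability * r.impact for r in risks)


def net_value(benefit: float, risks) -> float:
    """Perceived benefit minus perceived risk; positive favours disclosure."""
    return benefit - perceived_risk(risks)


# Hypothetical risks and scores, for illustration only.
risks = [
    Risk("identity theft", probability=0.05, impact=9.0),
    Risk("unwanted marketing", probability=0.60, impact=2.0),
]
value = net_value(benefit=3.0, risks=risks)
print("disclose" if value > 0 else "withhold", round(value, 2))
```

The sketch also shows why the typology matters: adding, removing or merging risk categories changes the total, and therefore the predicted decision.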

Research questions: The research set out to address the following questions:

  • Is there a reliable typology for personal risk that can be used to analyse the privacy calculus that users engage in?
  • What is the nature of the interactions and risks that users engage in when they use the Internet?
  • Can the new risk typology be applied to existing empirical data?
  • What effect will the new categorisation of risk have on the privacy calculus?
  • Can these figures be used to improve the privacy calculus model in order to better predict online user behaviour?


Dinev, T. & Hart, P., 2006. An Extended Privacy Calculus Model for E-Commerce Transactions. Information Systems Research, 17(1), pp.61–80.

Facebook Inc., 2017. Facebook Privacy Basics. Available at: https://www.facebook.com/about/basics [Accessed April 7, 2017].

Haynes, D. & Robinson, L., 2015. Defining User Risk in Social Networking Services. Aslib Journal of Information Management, 67(1), pp.94–115.

Krasnova, H., Veltri, N.F. & Günther, O., 2012. Self-disclosure and Privacy Calculus on Social Networking Sites: The Role of Culture. Business & Information Systems Engineering, 4(3), pp.127–135.

Rosenblum, D., 2007. What Anyone Can Know: the privacy risks of social networking sites. IEEE Security & Privacy, 5(3), pp.40–49.

Swedlow, B. et al., 2009. Theorizing and Generalizing about Risk Assessment and Regulation through Comparative Nested Analysis of Representative Cases. Law & Policy, 31(2), pp.236–269.

February 2018