Installing Dependencies

library(greta)

Why we need to install dependencies

The greta package uses Google’s TensorFlow (TF) and TensorFlow Probability (TFP) under the hood to do efficient, fast, and scalable linear algebra and MCMC. TF and TFP are Python packages, so they need to be installed separately. This is different to how normal dependencies work with R packages, where the dependencies are automagically built and managed by CRAN.

Unfortunately, there isn’t an automatic, reliable way to ensure that these are provided when you install greta, so we need to take an additional step to install them. We have tried very hard to make the process as easy as possible by providing a helper function, install_greta_deps().

How to install python dependencies using install_greta_deps()

We recommend running:

install_greta_deps()

Then follow any prompts to install dependencies. After that, you will need to restart R and run library(greta) to start using greta.

How install_greta_deps() works

The install_greta_deps() function installs the Python dependencies TF and TFP. By default it installs TF version 2.15.0 and TFP version 0.23.0. It places these inside a conda environment, “greta-env-tf2”. With the default settings, this environment uses Python 3.10. Using a conda environment isolates these exact Python modules from other Python installations, so only greta will see them.

We do this because it helps avoid installation issues, where previously you might update TF on your computer and overwrite the version needed by greta. Using this “greta-env-tf2” conda environment means that installing other Python packages should not impact the Python packages needed by greta. It is part of the approach recommended by the team at Posit for managing Python dependencies in an R package.

Using different versions of TF, TFP, and Python

The install_greta_deps() function takes three arguments:

  1. deps: the dependencies to install, specified with greta_deps_spec()
  2. timeout: time in minutes to wait for installation before failing/exiting
  3. restart: whether to restart R after installation: “force” restarts R, “no” does not restart, and “ask” (default) asks the user

You specify the version of TF, TFP, or Python that you want to use with greta_deps_spec(), which has arguments:

  • tf_version
  • tfp_version
  • python_version

If you specify versions of TF/TFP/Python that are not compatible with each other, you will get an error before installation starts. We determined the appropriate versions of Python, TF, and TFP from https://www.tensorflow.org/install/source#tested_build_configurations and https://www.tensorflow.org/install/source_windows#tested_build_configurations, and by inspecting TFP release notes. We put this information together into a dataset, greta_deps_tf_tfp. You can inspect this with View(greta_deps_tf_tfp).

If you provide invalid installation versions, the error message will suggest some alternative installation versions.
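As a minimal sketch of the arguments described above, the following pins versions explicitly and passes them to install_greta_deps(). The version strings are the defaults mentioned earlier, and the 10 minute timeout is an arbitrary value chosen for illustration; adjust both to suit your setup.

```r
library(greta)

# Look at the first few known-compatible TF / TFP / Python combinations
head(greta_deps_tf_tfp)

# Describe the versions to install (these match the documented defaults)
deps <- greta_deps_spec(
  tf_version = "2.15.0",
  tfp_version = "0.23.0",
  python_version = "3.10"
)

# Install them, waiting up to 10 minutes (an illustrative value),
# and ask before restarting R
install_greta_deps(
  deps = deps,
  timeout = 10,
  restart = "ask"
)
```

Building the specification first and passing it in as deps keeps the version choice separate from the installation call, which is handy if you want to reuse the same versions on another machine.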

How we install dependencies

This section is for users who want to know more about how greta installs its dependencies.

We create a separate R instance using callr to install the Python dependencies. Within it, we use reticulate to talk to Python, and the R package tensorflow to install the tensorflow Python module. We use callr so that the installation happens in a clean R session that doesn’t have Python or reticulate already loaded. It also means that we can hide the large amount of text printed to the console while installation is running. This output is written to a logfile during installation, which you can read with open_greta_install_log().

If miniconda isn’t installed, we install miniconda. You can think of miniconda as a lightweight version of python with minimal dependencies.

If “greta-env-tf2” isn’t found, then we create a new conda environment named “greta-env-tf2”, with a version of Python that works with the specified versions of TF and TFP.

Then we install the TF and TFP python modules, using the versions specified in greta_deps_spec().

After installation, we ask users if they want to restart R. This only happens in interactive sessions, and only if the user is in RStudio. This avoids potential issues when the function is called from batch scripts.
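To illustrate the idea of running work in a clean R session and capturing its output in a logfile, here is a small sketch using callr::r(). It is not greta's internal code: the function body, the temporary log path, and the 60 second timeout are all stand-ins chosen for this example.

```r
library(callr)

# Hypothetical log path for this illustration; greta manages its own logfile
log_file <- tempfile(fileext = ".log")

# Run a function in a fresh R process. Console output goes to the logfile
# instead of the current session; "2>&1" sends messages to the same file.
result <- callr::r(
  func = function() {
    cat("installing...\n")                     # standard output
    message("pretend installation message")    # standard error
    "done"                                     # value returned to the caller
  },
  stdout = log_file,
  stderr = "2>&1",
  timeout = 60  # seconds; an illustrative value
)

# The return value comes back to the calling session
result

# The hidden console output can be read from the logfile afterwards
log_lines <- readLines(log_file)
log_lines
```

Because the child process starts fresh, nothing loaded in the calling session (such as reticulate or an already initialised Python) can interfere with the work done inside it.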

Troubleshooting installation

Installation doesn’t always go to plan. Here are some approaches to getting your dependencies working.

  • Check you have restarted R after installing dependencies
    • After you have installed dependencies with install_greta_deps(), you will be prompted to restart R. To use greta you must restart R after installing dependencies as this allows greta to connect to the installed python dependencies.
  • Use greta_sitrep() to check dependencies.
    • greta_sitrep() will provide information about your installed version of Python, TF, TFP, and whether a conda environment is used. This can be helpful to troubleshoot some installation issues.
  • Check the installation logfile
    • During installation we write a logfile that records all of the steps taken. This can sometimes provide useful clues as to what might have gone awry. Open it with open_greta_install_log(), which shows the logfile in a browser window, then search with Ctrl/Cmd+F for terms such as “error”, “Error”, “ERROR”, or “warn” to find problems. There might not be a clear solution, but the logfile might provide clues that you can share on a forum or in an issue on the greta GitHub.
  • Reinstall greta dependencies with reinstall_greta_deps()
    • Sometimes we just need to “turn it off and on again”. Use reinstall_greta_deps() to remove miniconda, and the greta conda environment, and install them again.
  • Manually remove the Python installation
    • You can manually remove the Python installation with:
      • remove_greta_env()
      • remove_miniconda()
      • or destroy_greta_deps(), which does both of these steps.
    • Then install the dependencies with: install_greta_deps()
      • Note that this is functionally what reinstall_greta_deps() does, but sometimes it can be useful to separate them out into separate steps.
  • Check internet access
    • Installing these dependencies requires an internet connection, and sometimes a network restriction (perhaps set by your internet service provider or IT department) blocks downloads from sites like conda. We have encountered this issue in the past and found that it can be avoided by re-installing with reinstall_greta_deps().
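The checks above can be run in sequence from the R console. This is only a sketch of one possible order, using the helper functions named in the list; stop at whichever step reveals the problem.

```r
library(greta)

# 1. Report the Python, TF, and TFP versions greta can see,
#    and whether a conda environment is in use
greta_sitrep()

# 2. Open the installation logfile in a browser and search it for errors
open_greta_install_log()

# 3. If the cause is still unclear, remove miniconda and the greta
#    conda environment, then install them again
reinstall_greta_deps()
```

Remember to restart R after the final step so that greta can connect to the freshly installed dependencies.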

If the installation helpers above did not work, you can try the following:

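# Install miniconda, which provides the conda tooling
reticulate::install_miniconda()

# Create an isolated environment with a compatible Python version
reticulate::conda_create(
  envname = "greta-env-tf2",
  python_version = "3.10"
)

# Install the pinned TF and TFP versions into that environment
reticulate::conda_install(
  envname = "greta-env-tf2",
  packages = c(
    "tensorflow-probability==0.23.0",
    "tensorflow==2.15.0"
  )
)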

This installs the Python modules into a conda environment named “greta-env-tf2”.
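After a manual installation it can be useful to confirm that the environment exists and that the modules can be found. The following sketch uses reticulate functions directly rather than greta helpers; it assumes a fresh R session in which Python has not yet been initialised.

```r
# List the conda environments that reticulate can find
envs <- reticulate::conda_list()
"greta-env-tf2" %in% envs$name

# Point reticulate at the environment, failing loudly if it is missing
reticulate::use_condaenv("greta-env-tf2", required = TRUE)

# Check that both Python modules are available in that environment
reticulate::py_module_available("tensorflow")
reticulate::py_module_available("tensorflow_probability")
```

If either module check returns FALSE, the installation logfile or the console output from conda_install() is the next place to look.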

You can also install these without a dedicated conda environment, like so:

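# Install miniconda, which provides the conda tooling
reticulate::install_miniconda()

# Install TF and TFP without naming a dedicated environment
reticulate::conda_install(
  packages = c(
    "tensorflow-probability==0.23.0",
    "tensorflow==2.15.0"
  )
)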