About Databrary

Databrary is a powerful tool for storing and sharing video data and documentation with other researchers. With the databraryr package, it becomes even more powerful. Rather than interact with Databrary through a web browser, users can write their own code to download participant data or even specific files.

I wrote databraryr so that I could better understand how the site works under the hood, and so that I could streamline my own analysis and data sharing workflows.

Let’s get started.

Registering

Access to most of the material on Databrary requires prior registration and authorization from an institution, and that authorization requires the institution’s formal agreement. When you register, you create an account ID (your email address) and a secure password. Then, when you log in with your new credentials, you request authorization by selecting an existing institution (if yours is on the list), a new institution (if yours isn’t), or an existing authorized investigator (if you are a student, postdoc, or collaborator).

Installation

Official CRAN release
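
The released version can be installed from CRAN with base R's standard installer. This is the usual CRAN workflow, shown here as a minimal sketch:

```r
# Install the released version of databraryr from CRAN.
install.packages("databraryr")
```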

Development release
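
A development version is typically installed straight from the package's source repository. The sketch below assumes that the sources live in the GitHub repository databrary/databraryr and that the remotes package is used as the installer; neither detail is stated on this page, so adjust the repository name if it differs.

```r
# Install the helper package first if it is not already available.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Assumption: the development sources are hosted at
# github.com/databrary/databraryr.
remotes::install_github("databrary/databraryr")
```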

v0.6.2

First steps (while you await authorization)

Even before formal authorization is complete, you can access the public materials on Databrary. This vignette assumes you fall into this category.

Once you’ve installed the package following one of the above routes, it’s a good idea to check that your installation worked by loading it into your local workspace.

library(databraryr)

Then, try this command to pull data about Databrary’s founders:

databraryr::list_people()
#>   id sortname prename                       affiliation
#> 1  5   Adolph   Karen               New York University
#> 2  6  Gilmore Rick O. The Pennsylvania State University
#> 3  7  Millman   David               New York University
#>                                url               orcid
#> 1 http://www.psych.nyu.edu/adolph/                <NA>
#> 2     http://gilmore-lab.github.io 0000-0002-7676-3982
#> 3                             <NA>                <NA>

Note that this command returns a data frame (tibble) with columns that include the first name (prename), last name (sortname), affiliation, lab or personal website, and ORCID ID if available.
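
Because the result is an ordinary data frame, the usual R subsetting and summary tools apply to it. The sketch below rebuilds the three rows shown above by hand so that it runs without a network connection; in practice you would assign the result of `databraryr::list_people()` instead. The object names (`founders`, `with_orcid`, `full_names`, `by_affiliation`) are illustrative and not part of the package.

```r
# Rebuild the founders table by hand from the output shown above, so this
# sketch runs offline. In practice you would use:
#   founders <- databraryr::list_people()
founders <- data.frame(
  id = c(5L, 6L, 7L),
  sortname = c("Adolph", "Gilmore", "Millman"),
  prename = c("Karen", "Rick O.", "David"),
  affiliation = c(
    "New York University",
    "The Pennsylvania State University",
    "New York University"
  ),
  url = c(
    "http://www.psych.nyu.edu/adolph/",
    "http://gilmore-lab.github.io",
    NA
  ),
  orcid = c(NA, "0000-0002-7676-3982", NA),
  stringsAsFactors = FALSE
)

# Keep only the people who have an ORCID on record.
with_orcid <- founders[!is.na(founders$orcid), ]

# Build "First Last" display names from the prename and sortname columns.
full_names <- paste(founders$prename, founders$sortname)

# Count people per affiliation.
by_affiliation <- table(founders$affiliation)
```

Printing `with_orcid`, `full_names`, or `by_affiliation` at the console shows each result.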

Databrary assigns each person and institution on the system a unique integer called a ‘party ID’. Running list_people(people_list = 1:25) asks the system for information about all of the people whose party IDs are between 1 and 25. Let’s try it:

databraryr::list_people(people_list = 1:25)
#>    id        sortname       prename               orcid
#> 1   1           Simon         Dylan 0000-0002-2793-1679
#> 2   3         Steiger          Lisa                <NA>
#> 3   4           Byrne        Andrea                <NA>
#> 4   5          Adolph         Karen                <NA>
#> 5   6         Gilmore       Rick O. 0000-0002-7676-3982
#> 6   7         Millman         David                <NA>
#> 7  11   Tamis-LeMonda     Catherine                <NA>
#> 8  13             Roy Lina Wictoren                <NA>
#> 9  14        Franchak          John                <NA>
#> 10 16       Professor    Suzanne Q.                <NA>
#> 11 17 Jimenez-Robbins        Carmen                <NA>
#> 12 18             Coe           Jon                <NA>
#> 13 19             Foo         Vicky                <NA>
#> 14 20          Gordon         Peter                <NA>
#> 15 24            Chan        Gladys                <NA>
#>                              affiliation                              url
#> 1                                   <NA>                             <NA>
#> 2                              Databrary                             <NA>
#> 3                              Databrary                             <NA>
#> 4                    New York University http://www.psych.nyu.edu/adolph/
#> 5      The Pennsylvania State University     http://gilmore-lab.github.io
#> 6                    New York University                             <NA>
#> 7                    New York University                             <NA>
#> 8                                    NYU                             <NA>
#> 9    University of California, Riverside            http://padlab.ucr.edu
#> 10                             Databrary                             <NA>
#> 11                                  <NA>                             <NA>
#> 12                                  <NA>                             <NA>
#> 13                                  <NA>                             <NA>
#> 14 Teachers College, Columbia University                             <NA>
#> 15                                   NYU                             <NA>

It’s a bit slow, but you should see information about people beginning with Dylan Simon, the developer who designed and built most of the Databrary system, and ending with Gladys Chan, a graphic designer who created the Databrary and Datavyu logos and other graphic identity elements.
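
The request covered 25 party IDs, but the output above has only 15 rows. A quick way to see which IDs came back empty is to compare the requested range with the returned `id` column. The sketch below copies the returned IDs from the output above so that it runs offline. It is an assumption, not something this page states, that the missing IDs belong to institutions or to parties that are unassigned or not publicly visible.

```r
# Party IDs that were requested.
requested <- 1:25

# IDs copied from the output above. In practice you would use:
#   people <- databraryr::list_people(people_list = 1:25)
#   returned <- people$id
returned <- c(1L, 3L, 4L, 5L, 6L, 7L, 11L, 13L, 14L, 16L, 17L, 18L, 19L,
              20L, 24L)

# IDs in the requested range that produced no person record.
missing_ids <- setdiff(requested, returned)
```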

You can also see what’s new on Databrary. The get_db_stats() command gives you information about newly authorized people and institutions and about newly uploaded datasets. Try this:

databraryr::get_db_stats("stats")
#> # A tibble: 1 × 9
#>   date                investigators affiliates institutions datasets_total
#>   <dttm>                      <int>      <int>        <int>          <int>
#> 1 2024-03-26 09:05:40          1739        675          784           1665
#> # ℹ 4 more variables: datasets_shared <int>, n_files <int>, hours <dbl>,
#> #   TB <dbl>
databraryr::get_db_stats("people")
#> # A tibble: 5 × 6
#>      id sortname prename     affiliation               time                url  
#>   <int> <chr>    <chr>       <chr>                     <chr>               <chr>
#> 1  1261 Oudeyer  Pierre-Yves Inria, France             2024-03-25T19:34:3… <NA> 
#> 2 12387 Camodeca Marina      University of Udine       2024-03-25T19:14:5… <NA> 
#> 3 12346 Ledger   Susan       University of Newcastle   2024-03-25T15:48:0… <NA> 
#> 4 11934 Lang     Yue         Ocean University of China 2024-03-21T22:22:2… http…
#> 5 12465 Wang     Qian        ShanghaiTech University   2024-03-21T22:19:2… <NA>
databraryr::get_db_stats("institutions")
#> # A tibble: 18 × 13
#>       id sortname        prename affiliation time  url   institution name  body 
#>    <int> <chr>           <chr>   <chr>       <chr> <chr> <lgl>       <chr> <chr>
#>  1  1261 Oudeyer         Pierre… Inria, Fra… 2024… <NA>  NA          <NA>  <NA> 
#>  2 12508 National Insti… <NA>    <NA>        2024… http… TRUE        <NA>  <NA> 
#>  3 12387 Camodeca        Marina  University… 2024… <NA>  NA          <NA>  <NA> 
#>  4 12507 University of … <NA>    <NA>        2024… http… TRUE        <NA>  <NA> 
#>  5  1710 <NA>            <NA>    <NA>        2024… <NA>  NA          Denv… full…
#>  6 12346 Ledger          Susan   University… 2024… <NA>  NA          <NA>  <NA> 
#>  7  1709 <NA>            <NA>    <NA>        2024… <NA>  NA          Face… 4 an…
#>  8 11934 Lang            Yue     Ocean Univ… 2024… http… NA          <NA>  <NA> 
#>  9 12465 Wang            Qian    ShanghaiTe… 2024… <NA>  NA          <NA>  <NA> 
#> 10 12497 ShanghaiTech U… <NA>    <NA>        2024… http… TRUE        <NA>  <NA> 
#> 11  1708 <NA>            <NA>    <NA>        2024… <NA>  NA          SFARI ASD …
#> 12  1708 <NA>            <NA>    <NA>        2024… <NA>  NA          SFARI ASD …
#> 13  1708 <NA>            <NA>    <NA>        2024… <NA>  NA          SFARI ASD …
#> 14  1708 <NA>            <NA>    <NA>        2024… <NA>  NA          SFARI ASD …
#> 15  1708 <NA>            <NA>    <NA>        2024… <NA>  NA          SFARI ASD …
#> 16  1708 <NA>            <NA>    <NA>        2024… <NA>  NA          SFARI ASD …
#> 17  1706 <NA>            <NA>    <NA>        2024… <NA>  NA          Long… Data…
#> 18  1706 <NA>            <NA>    <NA>        2024… <NA>  NA          Long… Data…
#> # ℹ 4 more variables: creation <chr>, owners <list>, permission <int>,
#> #   publicsharefull <lgl>
databraryr::get_db_stats("datasets")
#> # A tibble: 10 × 8
#>       id name       body  creation owners       permission publicsharefull time 
#>    <int> <chr>      <chr> <chr>    <list>            <int> <lgl>           <chr>
#>  1  1710 Denver MA… full… 2024-03… <named list>          1 FALSE           2024…
#>  2  1709 Faces Mat… 4 an… 2024-03… <named list>          1 FALSE           2024…
#>  3  1708 SFARI      ASD … 2024-03… <named list>          1 TRUE            2024…
#>  4  1708 SFARI      ASD … 2024-03… <named list>          1 TRUE            2024…
#>  5  1708 SFARI      ASD … 2024-03… <named list>          1 TRUE            2024…
#>  6  1708 SFARI      ASD … 2024-03… <named list>          1 TRUE            2024…
#>  7  1708 SFARI      ASD … 2024-03… <named list>          1 TRUE            2024…
#>  8  1708 SFARI      ASD … 2024-03… <named list>          1 TRUE            2024…
#>  9  1706 Longitudi… Data… 2024-03… <named list>          1 FALSE           2024…
#> 10  1706 Longitudi… Data… 2024-03… <named list>          1 FALSE           2024…

Depending on when you run this command and how often, there may or may not be new items.
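
In the "datasets" output above, some dataset IDs appear on several rows (1708 appears six times). The page does not say why the rows repeat, so the sketch below simply counts rows per ID and reduces the list to unique datasets. The IDs are copied from the output above so that the sketch runs offline; the object names are illustrative.

```r
# Dataset IDs copied from the "datasets" output above. In practice you
# would use:
#   datasets <- databraryr::get_db_stats("datasets")
#   dataset_ids <- datasets$id
dataset_ids <- c(1710L, 1709L, rep(1708L, 6), rep(1706L, 2))

# How many rows does each dataset ID contribute?
rows_per_id <- table(dataset_ids)

# Unique dataset IDs, in order of first appearance.
unique_ids <- unique(dataset_ids)

# With the full tibble, the same idea keeps one row per dataset:
#   datasets[!duplicated(datasets$id), ]
```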

Next steps

To learn more about accessing data on Databrary with databraryr, see the accessing data vignette.

To see how to log in and log out once you have authorization, see the vignette for authorized users.