Goodbye Facebook. Hello Decentralized Social Media?

Internet Archive - 13 May 2022 - 9:40pm

The pending sale of Twitter to Elon Musk has generated a buzz about the future of social media and just who should control our data.

Wendy Hanamura, director of partnerships at the Internet Archive, moderated an online discussion on April 28, “Goodbye Facebook, Hello Decentralized Social Media?”, about the opportunities and dangers ahead. The webinar is part of a series of six workshops, “Imagining a Better Online World: Exploring the Decentralized Web.”

The session featured founders of some of the top decentralized social media networks, including Jay Graber, chief executive officer of R&D project Bluesky; Matthew Hodgson, technical co-founder of Matrix; and Andre Staltz, creator of Manyverse. Unlike Twitter, Facebook or Slack, Matrix and Manyverse have no central controlling entity. Instead, the peer-to-peer networks shift power to the users and protect privacy.

If Twitter is indeed bought and people are disappointed with the changes, the speakers expressed hope that the public will consider other social networks. “A crisis of this type means that people start installing Manyverse and other alternatives,” Staltz said. “The opportunity side is clear.” Still, if other platforms are not ready during the transition period, there is some risk that users will feel stuck and not switch, he added.

Hodgson said there are reasons to be both optimistic and pessimistic about Musk purchasing Twitter. The hope is that he will use his powers for good, making it available to everybody and empowering people to block the content they don’t want to see. The risk, Hodgson said, is that with no moderation and without sufficient controls to filter content, people will be obnoxious to one another and the system will melt down. “It’s certainly got potential to be an experiment. I’m cautiously optimistic on it,” he said.

People who work in decentralized tech recognize the risk that comes when one person can control a network and act for good or bad, Graber said. “This turn of events demonstrates that social networks that are centralized can change very quickly,” she said. “Those changes can potentially disrupt or drastically alter people’s identity, relationships, and the content that they put on there over the years. This highlights the necessity for transition to a protocol-based ecosystem.” 

When a platform is user-controlled, it is resilient to disruptive change, Graber said. Decentralization enables immutability, so change is hard: it is a slow process that requires a lot of people to agree, Staltz added.

The three leaders spoke about how decentralized networks provide a sustainable alternative and are gaining traction. Unlike major players that own user data and monetize personal information, decentralized networks are controlled by users and information lives in many different places.

“Society as a whole is facing a lot of crises,” Graber said. “We have the ability to, as a collective intelligence, to investigate a lot of directions at once. But we don’t actually have the free ability to fully do this in our current social architecture…if you decentralize, you get the ability to innovate and explore many more directions at once. And all the parts get more freedom and autonomy.”

Decentralized social media is structured to change the balance of power, added Hanamura: “In this moment, we want you to know that you have the power. You can take back the power, but you have to understand it and understand your responsibility.”

The webinar was co-sponsored by DWeb and Library Futures, and presented by the Metropolitan New York Library Council (METRO).

The next event in the series, Decentralized Apps, the Metaverse and the “Next Big Thing,” will be held Thursday, May 26, from 4 to 5 p.m. EST. Register here.

The post Goodbye Facebook. Hello Decentralized Social Media? appeared first on Internet Archive Blogs.

Fireside Chat: Congressman Ro Khanna in conversation with Larry Lessig

Internet Archive - 10 May 2022 - 9:02pm

Join us on Tuesday, May 31st at 6pm PT / 9pm ET for a fireside chat with Congressman Ro Khanna in conversation with Harvard Professor and author Lawrence Lessig to discuss Rep. Khanna’s book, Dignity in a Digital Age: Making Tech Work for All of Us.

In the Bay Area? You can join us at our San Francisco headquarters in person! Otherwise, tune in via Zoom. REGISTER NOW! Please note that our in-person seating is limited, so act fast to secure your spot.

You can reserve a copy of the book from our local bookseller, The Booksmith, by choosing the Add-On in the tickets section if you want to pick up your copy in person. Or if you want to order online and have the book shipped to your home, order here.

ABOUT THE BOOK:

In the digital age, unequal access to technology and the revenue it creates is one of the most pressing issues facing the United States. There is an economic gulf between those who have struck gold in the tech industry and those left behind by the digital revolution; a geographic divide between those in the coastal tech industry and those in the heartland whose jobs have been automated; and existing inequalities in technological access—students without computers, rural workers with spotty WiFi, and plenty of workers without the luxury to work from home.

Dignity in a Digital Age tackles these challenges head-on and imagines how the digital economy can create opportunities for people all across the country without uprooting them. Congressman Ro Khanna of Silicon Valley offers a vision for democratizing digital innovation to build economically vibrant and inclusive communities. Instead of being subject to tech’s reshaping of our economy, Representative Khanna argues that we must channel those powerful forces toward creating a more healthy, equal, and democratic society.

ABOUT RO KHANNA:
Ro Khanna represents Silicon Valley in Congress. He has taught economics at Stanford, served as Deputy Assistant Secretary of Commerce in the Obama Administration, and represented tech companies and startups in private practice. He is the author of Dignity in a Digital Age: Making Tech Work for All of Us. He enjoys spending time with his wife and two children in Washington, DC, and Fremont, California.

ABOUT LAWRENCE LESSIG:
Lawrence Lessig is the Roy L. Furman Professor of Law and Leadership at Harvard Law School. Prior to rejoining the Harvard faculty, where he was the Berkman Professor of Law until 2000, Lessig was a professor at Stanford Law School, where he founded the school’s Center for Internet and Society, and at the University of Chicago. Lessig clerked for Judge Richard Posner on the 7th Circuit Court of Appeals and Justice Antonin Scalia on the United States Supreme Court. He holds a BA in economics and a BS in management from the University of Pennsylvania, an MA in philosophy from Cambridge, and a JD from Yale.

New additions to the Internet Archive for April 2022

Internet Archive - 7 May 2022 - 5:30pm

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons: 
  • Chris Cromwell Rare Reel to Reel Tapes – Rare and recovered reel-to-reel tapes from a variety of sources and preserved by Chris Cromwell. 
  • 1940s Classic TV – Television from the 1940s.
  • Game Shows Archive – A collection of game shows throughout television history, involving chance, skill and luck, usually presided over by a host and providing in-show commercials.
  • Dutch Television – Television programs and videos in the Dutch language, or from the Netherlands.

Books – 50,109 New Items in April

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

  • Zebra mussels and aquatic nuisance species
  • Yang Fan shi jian = Intermission
  • L’appel de la forêt
  • Zemsta budzika : opowiastki domowe

Audio Archive – 150,224 New Items in April

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

  • Cabinet Live at Sherman Theater on 2022-04-01
  • Greyboy Allstars Live at the Fox Theater on 1998-07-30
  • Melvin Seals Live at 9:30 Club on 2022-04-12

LibriVox Audiobooks – 99 New Items in April

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

  • Oscar Wilde: The Complete Interviews
  • Recruit for Andromeda
  • A Book of Giants

78 RPMs and Cylinder Recordings – 6,745 New Items in April

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

  • Das ist Paris, Paris
  • FUMEE AUX YEUX

Live Music Archive – 909 New Items in April

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

  • The Wolf Tones Live at the Thunderbird Café Porch Stage on 2022-04-26
  • Marco Benevento Live at All Good In The Woods X – Dreamland Studios, Hurley, NY on 2022-04-16
  • lespecial Live at Gramercy Theatre on 2022-04-16

Netlabels – 111 New Items in April

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Explore.

Movies – 55 New Items in April

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

Library as Laboratory: New Lightning Talks Announced

Internet Archive - 6 May 2022 - 12:27am

Register now for our Library as Laboratory Lightning Talks session, May 11 @ 11am PT.

In this final session of the Internet Archive’s digital humanities expo, Library as Laboratory, you’ll hear from scholars in a series of short presentations about their research and how they’re using collections and infrastructure from the Internet Archive for their work.

Register now!

Speakers include:

WARC Collection Summarization

Sawood Alam (Internet Archive)

Items in the Internet Archive’s Petabox collections of various media types (image, video, audio, book, etc.) have rich metadata, representative thumbnails, and interactive hero elements. However, web collections, primarily containing WARC files and their corresponding CDX files, often look opaque. We created an open-source CLI tool called “CDX Summary” [1] to process sorted CDX files and generate reports. These summary reports give insights into various dimensions of CDX records/captures, such as the total number of mementos, the number of unique original resources, the distribution of various media types and their HTTP status codes, path and query segment counts, temporal spread, and capture frequencies of top TLDs, hosts, and URIs. We also implemented a uniform sampling algorithm to select a given number of random memento URIs (i.e., URI-Ms) with 200 OK HTML responses that can be used for quality assurance purposes or as a representative sample for the collection of WARC files. Our tool can generate both comprehensive and brief reports in JSON format as well as in a human-readable textual representation. We ran our tool on a selected set of public web collections in Petabox, stored the resulting JSON files in their corresponding collections, and made them publicly accessible (with the hope that they might be useful for researchers). Furthermore, we implemented a custom Web Component that can load CDX Summary report JSON files and render them in interactive HTML representations. Finally, we integrated this Web Component into the collection/item views of the main site of the Internet Archive, so that patrons can access rich and interactive information when they visit a web collection/item in Petabox. We also found our tool useful for crawl operators, as it helped us identify numerous issues in some of our crawls that would have otherwise gone unnoticed.
[1] https://github.com/internetarchive/cdx-summary/ 
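
The summarization and sampling steps described in this abstract can be illustrated with a minimal Python sketch. It is not the CDX Summary tool's code. It assumes space-separated CDX lines whose first five fields are URL key, 14-digit timestamp, original URI, media type, and HTTP status code, and it builds URI-Ms with the Wayback Machine's https://web.archive.org/web/ prefix. The function name summarize_cdx, the example records, and the choice of reservoir sampling as the uniform sampling method are assumptions made for illustration.

```python
import random
from collections import Counter
from urllib.parse import urlsplit


def summarize_cdx(lines, sample_size=2, seed=None):
    """Summarize CDX records and uniformly sample 200 OK HTML captures.

    Hypothetical helper: assumes each line starts with the fields
    urlkey, timestamp, original URI, media type, status code.
    """
    rng = random.Random(seed)
    mime_counts, status_counts, host_counts = Counter(), Counter(), Counter()
    unique_urls = set()
    captures = 0
    sample = []    # reservoir of sampled URI-Ms
    eligible = 0   # number of 200 OK HTML captures seen so far

    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        _urlkey, timestamp, original, mimetype, status = fields[:5]
        captures += 1
        unique_urls.add(original)
        mime_counts[mimetype] += 1
        status_counts[status] += 1
        host_counts[urlsplit(original).hostname or ""] += 1

        if status == "200" and mimetype == "text/html":
            eligible += 1
            uri_m = f"https://web.archive.org/web/{timestamp}/{original}"
            if len(sample) < sample_size:
                sample.append(uri_m)
            else:
                # Reservoir sampling (Algorithm R): keep the new item with
                # probability sample_size / eligible.
                slot = rng.randrange(eligible)
                if slot < sample_size:
                    sample[slot] = uri_m

    return {
        "captures": captures,
        "unique_urls": len(unique_urls),
        "media_types": dict(mime_counts),
        "status_codes": dict(status_counts),
        "top_hosts": host_counts.most_common(3),
        "sample": sample,
    }


# Made-up records for illustration only.
EXAMPLE_CDX = [
    "org,example)/ 20200101000000 http://example.org/ text/html 200 AAAA 1024",
    "org,example)/ 20210101000000 http://example.org/ text/html 200 BBBB 1100",
    "org,example)/logo.png 20200101000001 http://example.org/logo.png image/png 200 CCCC 2048",
    "org,example)/old 20200101000002 http://example.org/old text/html 404 DDDD 300",
    "net,example)/ 20200202000000 http://example.net/ text/html 200 EEEE 900",
]

report = summarize_cdx(EXAMPLE_CDX, sample_size=2, seed=0)
```

Reservoir sampling is used here because it needs only a single pass over a sorted CDX file and never holds more than sample_size URI-Ms in memory, which suits very large collections; the actual tool may use a different approach.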

More Than Words: Fed Chairs’ Communication During Congressional Testimonies

Michelle Alexopoulos (University of Toronto)

 Economic policies enacted by the government and its agencies have large impacts on the welfare of businesses and individuals—especially those related to fiscal and monetary policy. Communicating the details of the policies to the public is an important and complex undertaking. Policymakers tasked with the communication not only need to present complicated information in simple and relatable terms, but they also need to be credible and convincing—all the while being at the center of the media’s spotlight. In this briefing, I will discuss recent research on the applications of AI to monetary policy communications, and lessons learned to date. In particular, I will report on my recent ongoing project with researchers at the Bank of Canada that analyzes the effects of emotional cues by the Chairs of the U.S. Federal Reserve on financial markets during congressional testimonies.  

While most previous work has mainly focused on the effects of a central bank’s highly scripted messages about its rate decisions delivered by its leader, we use resources from the Internet Archive, C-SPAN and copies of testimony transcripts and apply a variety of tools and techniques to study both the messages and the messengers’ delivery of them. I will review how we apply recent advances in machine learning and big data to construct measures of Federal Reserve Chairs’ emotions, expressed via their words, voices, and faces, as well as discuss challenges encountered and our findings to date. In all, our initial results highlight the salience of the Fed Chair’s emotional cues for shaping market responses to Fed communications. Understanding the effects of non-verbal communication and responses to verbal cues may help policy makers improve upon their communication strategies going forward.

Digging into the (Internet) Archive: Examining the NSFW Model Responsible for the 2018 Tumblr Purge

Renata Barreto (University of California Berkeley)

In December 2018, Tumblr took down massive amounts of LGBTQ content from its platform. Motivated in part by increasing pressures from financial institutions and a newly passed law (SESTA/FOSTA, which made companies liable for sex trafficking online), Tumblr implemented a strict “not safe for work” or NSFW model, whose false positives included images of fully clothed women, handmade and digital art, and other innocuous objects, such as vases. The Archive Team, in conjunction with the Internet Archive, jumped into high gear and began to scrape self-tagged NSFW blogs in the two weeks between Tumblr’s announcement of its new policy and its algorithmic operationalization. At the time, Tumblr was considered a safe haven for the LGBTQ community; Yahoo! had bought Tumblr in 2013 for $1.1 billion. In the aftermath of the so-called “Tumblr purge,” Tumblr lost its main user base and, as of 2019, was valued at $3 million. This paper digs into a slice of the 90 TB of data saved by the Archive Team. This is a unique opportunity to peek under the hood of Yahoo’s open_nsfw model, which experts believe was used in the Tumblr purge, and examine the distribution of false positives on the Archive Team dataset. Specifically, we run the open_nsfw model on our dataset and use the t-SNE algorithm to project the similarities across images into 3D space.
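
One way to picture the false-positive analysis described here is the Python sketch below. It does not run open_nsfw. It assumes each image already has a model score between 0 and 1 and a human-assigned label, and it measures how often images marked as safe would still be flagged at a chosen threshold. The function name, threshold, category names, and scores are made-up illustrative values, not results from the Archive Team dataset.

```python
from collections import Counter


def false_positives_by_category(records, threshold=0.8):
    """Return the share of safe images flagged as NSFW, per category.

    records: iterable of (category, score, is_nsfw) tuples, where score is
    a hypothetical classifier output and is_nsfw is a human label.
    """
    flagged_safe = Counter()
    total_safe = Counter()
    for category, score, is_nsfw in records:
        if is_nsfw:
            continue  # only safe images can be false positives
        total_safe[category] += 1
        if score >= threshold:
            flagged_safe[category] += 1
    # False-positive rate per category among safe images.
    return {c: flagged_safe[c] / total_safe[c] for c in total_safe}


# Invented scores for illustration only.
EXAMPLE_RECORDS = [
    ("digital art", 0.91, False),
    ("digital art", 0.35, False),
    ("vase", 0.85, False),
    ("clothed portrait", 0.40, False),
    ("explicit", 0.97, True),
]

rates = false_positives_by_category(EXAMPLE_RECORDS, threshold=0.8)
```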

Japan As They Saw It (video)

Tom Gally (University of Tokyo)

“Japan As They Saw It” is a collection of descriptions of Japan by American and British visitors in the 1850s and later. Japan had been closed to outsiders for more than two centuries, and there was much curiosity in the West about this newly accessible country. The excerpts are grouped by category—Land, People, Culture, etc.—and each excerpt is linked to the book where it first appeared at the Internet Archive. “Japan As They Saw It” can be read online, or it can be downloaded as a free ebook.

Forgotten Novels of the 19th Century (video)

Tom Gally (University of Tokyo)

Novels were the binge-watched television, the hit podcasts of the 19th century—immersive, addictive, commercial—and they were produced and consumed in huge numbers. But many novels of that era have slipped through the cracks of literary memory. “Forgotten Novels of the 19th Century” is a list of fifty of those neglected novels, all waiting to be discovered and read for free at the Internet Archive.

Forgotten Histories of the Mid-Century Coding Bootcamp

Kate Miltner (University of Edinburgh)

Over the past 10 years, Americans have been exhorted to “learn to code” in order to solve a series of entrenched social issues: the tech “skills gap”, the looming threat of AI and automation, social mobility, and the underrepresentation of women and people of color in the tech industry. In response to this widespread discourse, an entire industry of short-term intensive training courses, otherwise known as coding bootcamps, has sprung up across the US, bringing in hundreds of millions of dollars in revenue a year and training tens of thousands of people. Coding bootcamps have been framed as a novel kind of institution that is equipped to solve contemporary problems. However, materials from the Internet Archive show us that, in fact, a similar discourse about computer programming and similar organizations called EDP schools existed over 70 years ago. This talk will showcase materials from the Ted Nelson Archive and the Computerworld archive to show how lessons from the past can inform the present.

Automatic scanning with an Internet Archive TT scanner (video)

Art Rhyno (University of Windsor)

The University of Windsor has set up a mechanism for automatic scanning with an Internet Archive TT scanner, used for the library’s Major Papers collection.

Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index

Spencer Torene (Thomson Reuters Special Services, LLC)

Developing semantic hierarchies from user-created hashtags in social media can provide useful organizational structure to large volumes of data. However, construction of these hierarchies is difficult using established ontologies (e.g. WordNet) due to the differences in the semantic and pragmatic use of words vs. hashtags in social media. While alternative construction methods based on hashtag frequency are relatively straightforward, these methods can be susceptible to the dynamic nature of social media, such as hashtags associated with surges in popularity. We drew inspiration from the ecologically-based Shannon Diversity Index (SDI) to create a more representative and resilient method of semantic hierarchy construction that relies upon graph-based community detection and a novel, entropy-based ensemble diversity index (EDI) score. The EDI quantifies the contextual diversity of each hashtag, resulting in thousands of semantically-related groups of hashtags organized along a general-to-specific spectrum. Through an application of EDI to social media data (Twitter) and a comparison of our results to prior approaches, we demonstrate our method’s ability to create semantically consistent hierarchies that can be flexibly applied and adapted to a range of use cases.
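
For reference, the classical Shannon Diversity Index that inspired this work is H = -sum(p_i * ln(p_i)), where p_i is the share of observations falling in group i. The Python sketch below computes it for a hashtag's counts across communities. That framing, the function name, and the counts are illustrative assumptions, and the code is not the authors' EDI score, which the abstract does not define.

```python
import math


def shannon_diversity(counts):
    """Return the Shannon Diversity Index H = -sum(p_i * ln(p_i))."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:  # empty groups contribute nothing
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical counts of how often a hashtag appears in each detected community.
broad_tag = [25, 25, 25, 25]   # spread evenly across four communities
narrow_tag = [97, 1, 1, 1]     # concentrated in one community

broad_h = shannon_diversity(broad_tag)
narrow_h = shannon_diversity(narrow_tag)
```

Under these assumptions, a hashtag spread evenly across many communities scores higher than one concentrated in a single community, which is one plausible way to read the general-to-specific spectrum mentioned in the abstract.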

Web and cities: (early internet) geographies through the lenses of the Internet Archive

Emmanouil Tranos (University of Bristol)

While geographers first turned their focus to the internet 25 years ago, the wealth of data that the Internet Archive preserves and offers remains largely unexplored, especially for projects that are large in scope and geographical scale. However, there is hardly any other data source that depicts the evolution of our interaction with the digital and, importantly, the spatial footprint of this interaction better than the Internet Archive. Therefore, over the last few years we have made extensive use of data from the Internet Archive in order to understand the geography and the evolution of the creation of online content and their interrelation with cities and spatial structure. Specifically, we have worked with The British Library and utilised the JISC UK Web Domain Dataset (1996-2013) [1] for a number of projects in order to (i) explore whether the availability of online content of local interest can attract individuals online, (ii) assess how early engagement with web tools can affect future productivity, (iii) map the evolution of economic clusters, and (iv) predict interregional trade flows. The Internet Archive helps us not only to map the evolution and the geography of engagement with the internet, especially at its early stages, and therefore draw important lessons regarding new future technologies, but also to understand economic activities that take place within and between cities.
[1] http://data.webarchive.org.uk/opendata/ukwa.ds.2/

Dr. Abdul-Mageed Launches a New Platform for Deep Learning Training

Interpares Trust AI - 5 May 2022 - 6:50pm
Assistant Professor Dr. Muhammad Abdul-Mageed and his team at the Deep Learning & Natural Language Processing (DLNLP) Group at UBC have recently launched Learnera.ai, a new educational platform for Deep Learning training.

Learnera will focus on offering hands-on Deep Learning (DL) training. DL is a class of machine learning methods inspired by information processing in the brain. It involves learning layered representations of richly complex concepts from lower-level, simpler concepts. DL is the technology behind many recent breakthroughs in communication, health, and manufacturing, among other fields. DL methods can help us analyze and better understand unstructured data by creating models that emulate human decision-making capabilities. All of these transformational changes are impacting broad sectors of our society.

Read more here.

Preserving Wilmington History on the Web

Internet Archive - 3 May 2022 - 4:00pm

Guest Post by: Tricia Dean, Tech Services Manager at Wilmington Public Library District (IL)

This post is part of a series written by members of the Community Webs program. Community Webs advances the capacity for community-focused memory organizations to build web and digital archives documenting local histories and underrepresented voices. For more information, visit communitywebs.archive-it.org/

Wilmington Public Library. Photo: T. Dean 4/21/22

I was excited when I saw the call for participants in Community Webs. While Wilmington, Illinois is a small, rural town (5,664 people), I thought we still had something to contribute. Most Archive-It partners are universities, museums and large libraries, and being in their company was a little daunting to me initially. Other institutions have someone who opens the project, and then it develops into a larger team project. Wilmington Public Library District (WPLD) has a much smaller staff; the project has been wholly mine, which has been both thrilling and terrifying.

Wilmington is a small rural town, falling on the lower end of the economic scale. Because we are isolated, the library plays a vital part in the community. We offer the usual storytimes and adult programs, but also loan out hotspots and ChromeBooks. We have 45 hotspots and these are almost always checked out; some people are using them for vacations, but by usage it is apparent that others are using them as their primary means of connecting to the Internet. Internet access has become more and more important, and after Covid-19 broke out, more governmental services went strictly online, making access even more critical – including for many who had not been regular patrons. WPLD is a hub for the community, offering computers, information, tax forms, and a place to come in and chat – even more important when we are trying to stay close and limit outside contact.

Main Street in Wilmington, circa 1900

I am a Chicago native who went to Champaign-Urbana for grad school. I was a scanner for the Internet Archive for several years where I was privileged to handle some incunabula (pre-1500 items). I am the Technical Services Supervisor at Wilmington; primarily I catalog our materials, but I also tend toward Projects, from adding series labels to re-orienting all the calls in the juvenile non-fiction section.  I am currently going through our attic to help determine what we have (it’s a Mystery!). I’m making lists, and hoping to have items to scan which would be available online, in multiple places. I applied for the Community Webs program (with my director’s blessing) because I felt that it’s important for small towns to be represented in the collection of history. Only 20% of the population still lives outside major metro areas, but it is every bit as important to capture that life as it is to retain the history of large cities.

Wilmington Library joined Community Webs in the summer of 2021. After some technical clarifications with the Archive-It staff, WPLD was set up. In considering what made Wilmington unique, we started with links to our library and social media pages. Social media has grown in importance in the last twenty years, but it became a vital link during Covid when services were otherwise unavailable. Wilmington Library YouTube videos (how-tos, crafts and storytime) stand as a reminder of how we responded and as a continuing reference for parents who can’t get to the library. But since social media, specifically, is known for ‘right now,’ it lacks the kind of reflection over time that we can create through the Community Webs project.

We may be small, but we have a number of historical articles and sites which needed to be brought together. We want to reflect events that have been impactful to our community, from the explosion of the Joliet Armory in the 1940s to the continuing issues with the Wilmington Dam, which has proved dangerous, but has complicated ownership issues. I still have a long way to go; the projects (attic/local history/web archive) are all intertwined. Wilmington has the usual Community Resources and City Government collections in Archive-It. Going forward, we want to continue to develop our Wilmington History collection. We are working on local history and will establish a collection of materials from our attic and public donations. Our local paper has vertical files which could be a goldmine of information – again, on my to-do list.  We will be kicking off an Oral History Project, which will begin with a series of simple gatherings/coffee hours for our seniors, providing a place for them to gather, and a space to share their stories. I am hoping these will be in our Community Webs archive. Who better to speak to where we’ve been and where we are than some of our oldest residents?

Wilmington Dam (present). Photo: T. Dean 4/21/22 [Photo by John Irvine – Chicago Tribune – August 29, 1992]. Shallow appearing dam is still quite hazardous, partially because it doesn’t ‘look’ dangerous – photo long before warning signs went up.

Why is Community Webs important? Because it will help to remember when we cannot keep up with the information overload. Because there is so much happening that we miss a good deal of what is around us – or can’t bear to face it for long. Because so very very much of our lives are now online – and can be erased with a keystroke. Because we are seeing, painfully, that those who do not learn from the past will be/are condemned to re-live it. And, for Wilmington, I think it is important because so many of the voices and sites being captured are from museums, universities and large public libraries. It is important that we remember that we used to be far less urban than we are today. It is important to remember the smaller places, those who are too easily lost in the maelstrom of modern life, because to be forgotten is to be erased.

Library as Laboratory Recap: Analyzing Biodiversity Literature at Scale

Internet Archive - 3 May 2022 - 2:00pm

At a recent webinar hosted by the Internet Archive, leaders from the Biodiversity Heritage Library (BHL) shared how its massive open access digital collection documenting life on the planet is an invaluable resource for scientists and ordinary citizens alike.

“The BHL is a global consortium of the leading natural history museums, botanical gardens, and research institutions, big and small, from all over the world. Working together and in partnership with the Internet Archive, these libraries have digitized more than 60 million pages of scientific literature available to the public,” said Chris Freeland, director of Open Libraries and moderator of the event.

Established in 2006 with a commitment to inspiring discovery through free access to biodiversity knowledge, BHL has 19 members and 22 affiliates, plus 100 worldwide partners contributing data. The BHL has content dating back nearly 600 years alongside current literature that, when liberated from the print page, holds immense promise for advancing science and solving today’s pressing problems of climate change and the loss of biodiversity.

Martin Kalfatovic, BHL program director and associate director of the Smithsonian Libraries and Archives, noted in his presentation that Charles Darwin and colleagues famously said “the cultivation of natural science cannot be efficiently carried on without reference to an extensive library.”

“Today, the Biodiversity Heritage Library is creating this global, accessible open library of literature that will help scientists, taxonomists, environmentalists, a host of people working with our planet, to actually have ready access to these collections,” Kalfatovic said. BHL’s mission is to improve research methodology by working with its partner libraries and the broader biodiversity and bioinformatics community. BHL draws about 142,000 visitors each month and has served 12 million users overall.

Most of the BHL’s materials are from collections in the global north, primarily in large, well-funded institutions. Digitizing these collections helps level the playing field, providing researchers in all parts of the world equal access to vital content.

The vast collection includes species descriptions, distribution records showing where species live, climate records, the history of scientific discovery, and information on extinct species. To date, BHL has made over 176,000 titles and 281,000 volumes available. Through a partnership with the Global Names Architecture project, more than 243 million instances of taxonomic (Latin) names have been found in BHL content.

Kalfatovic underscored the value of BHL content in understanding the environment in the wake of recent troubling news from the Sixth Assessment Report (AR6) published by the  Intergovernmental Panel on Climate Change about the impact of the earth’s warming. 

Biodiversity Heritage Library by the numbers.

“The outlook for the planet is challenging,” he said. “By unlocking this historic data, we can find out where we’ve been over time to find out more about where we need to be in the future.”

JJ Dearborn, BHL data manager, discussed how digitization transforms physical books into digital objects that can be shared with “anyone, at any time, anywhere.” She described the Wikimedia ecosystem as “fertile ground for open access experimentation,” crediting the organization with giving BHL the ability to reach new audiences and transform its data into 5-star linked open data. “Dark data” locked up in legacy formats, JP2s, and OCR text is a source of valuable checklist, species occurrence, and event sampling data that the larger biodiversity community can use to improve humanity’s collective ability to monitor biodiversity loss and the destructive impacts of climate change at scale.

The majority of the world’s data today is siloed, unstructured, and unused, Dearborn explained. This “dark data” “represents an untapped resource that could really transform human understanding if it could be truly utilized,” she said. “It might represent a gestalt leap for humanity.” 

The event was the fifth in a series of six sessions highlighting how researchers in the humanities use the Internet Archive. The final session of the Library as Laboratory series will be a series of lightning talks on May 11 at 11am PT / 2pm ET—register now!

The post Library as Laboratory Recap: Analyzing Biodiversity Literature at Scale appeared first on Internet Archive Blogs.

Helping Ukrainian Scholars, One Book at a Time

Internet Archive - 2 May 2022 - 2:00pm

The Internet Archive is proud to partner with Better World Books to support Ukrainian students and scholars. With a $1 donation at checkout during your purchase at betterworldbooks.com, you will help provide verifiable information to Ukrainian scholars all over the world through Wikipedia.

Since 2019, the Internet Archive has worked with the Wikipedia community to strengthen citations to published literature. Working in collaboration with Wikipedians and data scientists, Internet Archive has linked hundreds of thousands of citations in Wikipedia to books in our collection, offering Wikipedia editors and readers single-click access to the verifiable facts contained within libraries. 

Recently, our engineers analyzed the citations in the Ukrainian-language Wikipedia, and were able to connect citations to more than 17,000 books that have already been digitized by the Internet Archive, such as the page for Геноміка (English translation: Genomics), which links to a science textbook published in 2002. Through this work, we discovered that there are more than 25,000 additional books that we don’t have in our collection—and that’s where you can help! 

Now through the end of June, when you make a $1 donation at checkout during your purchase at betterworldbooks.com, your donation will go to acquire and digitize books that are cited in the Ukrainian-language Wikipedia. With your help, we can ensure that Ukrainian scholars and people studying Ukraine have access to authoritative, factual information about Ukrainian history and culture. 

Thank you for making a difference by buying books from Better World Books, and helping Ukrainian students and scholars with your donation.

The post Helping Ukrainian Scholars, One Book at a Time appeared first on Internet Archive Blogs.

Library as Laboratory: Opening Television News for Deep Analysis and New Forms of Interactive Search

Internet Archive - 25 April 2022 - 3:54pm

Watching a single episode of the evening news can be informative. Tracking trends in broadcasts over time can be fascinating. 

The Internet Archive has preserved nearly 3 million hours of U.S. local and national TV news shows and made the material open to researchers for exploration and non-consumptive computational analysis. At a webinar April 13, TV News Archive experts shared how they’ve curated the massive collection and leveraged technology so scholars, journalists and the general public can make use of the vast repository.

Roger Macdonald, founder of the TV News Archive, and Kalev Leetaru, collaborating data scientist and GDELT Project founder, spoke at the session. Chris Freeland, director of Open Libraries, served as moderator and Internet Archive founder Brewster Kahle offered opening remarks.

Watch video

“Growing up in the television age, [television] is such an influential, important medium—persuasive, yet not something you can really quote,” Kahle said. “We wanted to make it so that you could quote, compare and contrast.” 

The Internet Archive built on the work of the Vanderbilt Television Archive and the UCLA Library Broadcast NewsScape to give the public a broader “macro view,” said Kahle. The trends seen in at-scale computational analyses of news broadcasts can be used to understand the bigger picture of what is happening in the world and the lenses through which we see the world around us.

In 2012, with donations from individuals and philanthropies such as the Knight Foundation, the Archive started repurposing the closed captioning data stream required of all U.S. broadcasters into a search index. “This simple approach transformed the antiquated experience of searching for specific topics within video,” said Macdonald, who helped lead the effort. “The TV caption search enabled discovery at internet speed with the ability to simultaneously search millions of programs and have your results plotted over time, down to individual broadcasters and programs.”

“[Television] is such an influential, important medium—persuasive, yet not something you can really quote. We wanted to make it so that you could quote, compare and contrast.”

Brewster Kahle, Internet Archive

Scholars and journalists were quick to embrace this opportunity, but the team kept experimenting with deeper indexing. Techniques like audio fingerprinting, Optical Character Recognition (OCR) and Computer Vision made it possible to capture visual elements of the news and improve access, Macdonald said. 

Sub-collections of political leaders’ speeches and interviews have been created, including an extensive Donald Trump Archive. Some of the Archive’s most productive advances have come from collaborating with outsiders who have requested more access to the collection than is available through the public interface, Macdonald said. With appropriate restrictions to maintain respect for broadcasters and distribution platforms, the Archive has worked with select scientists and journalists as partners to use data in the collection for more complex analyses.

Treating television news as data creates vast opportunities for computational analysis, said Leetaru. Researchers can track how often words are used in the news and how that has changed over time. For instance, it’s possible to look at mentions of COVID-related words across selected news programs and see when they surged and leveled off with each wave before plummeting.
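As a rough illustration of this kind of word-frequency tracking, the sketch below counts mentions of a few COVID-related terms per month. The sample captions, the term list, and the function name monthly_mentions are all hypothetical stand-ins; they are not the TV News Archive's data or tooling.

```python
from collections import Counter
from datetime import date

# Hypothetical sample: (air date, caption text) pairs standing in for real caption data.
captions = [
    (date(2020, 1, 15), "Officials are monitoring a new virus overseas."),
    (date(2020, 3, 20), "COVID cases rise as the coronavirus pandemic spreads. COVID testing expands."),
    (date(2020, 3, 25), "The pandemic closes schools."),
    (date(2020, 6, 10), "Reopening plans continue despite COVID concerns."),
]

# Assumed term list; a real study would choose its vocabulary carefully.
TERMS = {"covid", "coronavirus", "pandemic"}


def monthly_mentions(records, terms):
    """Count how often any of the terms appears in captions, grouped by (year, month)."""
    counts = Counter()
    for air_date, text in records:
        # Lowercase each word and strip surrounding punctuation before matching.
        words = (w.strip(".,!?;:\"'").lower() for w in text.split())
        hits = sum(1 for w in words if w in terms)
        counts[(air_date.year, air_date.month)] += hits
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for (year, month), n in monthly_mentions(captions, TERMS).items():
        print(f"{year}-{month:02d}: {n}")
```

Plotting the resulting monthly counts over a longer period would show the surges and declines described above.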

The newly computed metadata can help provide context and assist with fact-checking efforts to combat misinformation. It can allow researchers to map the geography of television news, showing how certain parts of the world are covered more than others, Leetaru said. Through the collections, researchers have explored which presidential tweets challenging election integrity got the most exposure on the news. OCR of every frame has been used to create models that identify the name of every “Dr.” depicted on cable TV after the outbreak of COVID-19 and calculate the air time devoted to the medical doctors commenting on one of the virus variants. Reverse image lookup of images in TV news has been used to determine the source of photos and videos. Visual entity search tools can even reveal the increasing prevalence of bookshelves as backdrops during home interviews in the pandemic, as well as appearances of books by specific authors or titles. Open datasets of computed TV news metadata are available that include all visual entity and OCR detections, 10-minute interval captioning ngrams and second-by-second inventories of each broadcast cataloging whether it was “News” programming, “Advertising” programming or “Uncaptioned” (in the case of television news this is almost exclusively advertising).

From television news to digitized books and periodicals, dozens of projects rely on the collections available at archive.org for computational and bibliographic research across a large digital corpus. Data scientists or anyone with questions about the TV News Archive can contact info@archive.org.

Up Next

This webinar was the fourth in a series of six sessions highlighting how researchers in the humanities use the Internet Archive. The next will be about Analyzing Biodiversity Literature at Scale on April 27. Register here.

The post Library as Laboratory: Opening Television News for Deep Analysis and New Forms of Interactive Search appeared first on Internet Archive Blogs.

What’s in your smart wallet? “Keeping your Personal Data Personal: How Decentralized Identity Drives Privacy” with Internet Archive, Metro Library Council, and Library Futures

Internet Archive - 21 April 2022 - 7:59pm

How many passwords do you have saved, and how many of them are controlled by a large, corporate platform instead of by you? Last month’s “Keeping your Personal Data Personal: How Decentralized Identity Drives Privacy” session started with that provocative question in order to illustrate the potential of this emerging technology.

Self-sovereign identity (SSI), defined as “an idea, a movement, and a decentralized approach for establishing trust online,” sits in the middle of the stack of technologies that makes up the decentralized internet. In the words of the Decentralized Identity Resource Guide written specifically for this session, “self-sovereign identity is a system where users themselves–and not centralized platforms or services like Google, Facebook, or LinkedIn–are in control and maintain ownership of their personal information.”

Research shows that the average American has more than 150 different accounts and passwords – a number that has likely skyrocketed since the start of the pandemic. In her presentation, Wendy Hanamura, Director of Partnerships at the Internet Archive, discussed the implications of “trading privacy and security for convenience.” Hanamura drew on her recent experience at SXSW, which bundled her personal data, including medical and vaccine data, into an insecure QR code used by a corporate sponsor to verify her as a participant. In contrast, Hanamura said, the twenty-year-old concept of self-sovereign identity can disaggregate these services from corporations, empowering people to be in better control of their own data and identity through principles like control, access, transparency, and consent. While self-sovereign identity presents incredible promise as a concept, it also raises fascinating technical questions around verification and management.

For Kaliya “Identity Woman” Young, her interest in identity comes from networks of global ecology and information technology, which she has been part of for more than twenty years. In 2000, when the Internet was still nascent, she joined with a community to ask: “How can this technology best serve people, organizations, and the planet?” Underlying her work is the strong belief that people should have the right to control their own online identity with the maximum amount of flexibility and access. Using a real-life example, Young compared self-sovereign identity to a physical wallet. Like a wallet, self-sovereign identity puts users in control of what they share, and when, with no centralized ability for an issuer to tell when the pieces of information within the wallet are presented.

In contrast, the modern internet operates with a series of centralized identifier systems, such as those coordinated by ICANN and IANA for domain names and IP addresses, and corporate private namespaces like Google and Facebook. Young’s research and work decentralizes this way of transmitting information through “signed portable proofs,” which come from a variety of sources rather than one centralized source. These proofs are also called verifiable credentials and contain metadata, the claim itself, and an embedded digital signature for validation. All of these pieces come together in a digital wallet, verified by a digital identifier that is unique to a person. Utilizing cryptography, these identifiers would be validated by digital identity documents and registries. In this scenario, organizations like InCommon, an access management service, or even a professional licensing organization like the American Library Association can maintain lists of institutions that would be able to verify the identity or organizational affiliation of an identifier. In the end, Young emphasized a message of empowerment – in her work, self-sovereign identity is about “innovating protocols to represent people in the digital realm in ways that empower them and that they control.”
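The three parts of such a proof (metadata, the claim, and a signature over both) can be sketched in a few lines. This is only an illustration under stated assumptions: the field names, the did:example identifiers, and the helper functions are hypothetical, and an HMAC with a shared secret stands in for the public-key signature a real verifiable credential would carry.

```python
import hashlib
import hmac
import json

# Assumption: a shared secret stands in for the issuer's private key.
# Real verifiable credentials use public-key signatures, so a verifier
# never needs the issuer's secret; HMAC is used here only to keep the
# sketch within the standard library.
ISSUER_KEY = b"demo-issuer-key"


def _canonical(metadata, claim):
    """Serialize metadata and claim deterministically so signatures are reproducible."""
    return json.dumps({"metadata": metadata, "claim": claim}, sort_keys=True).encode("utf-8")


def issue_credential(metadata, claim, key=ISSUER_KEY):
    """Bundle metadata, the claim, and a signature over both."""
    signature = hmac.new(key, _canonical(metadata, claim), hashlib.sha256).hexdigest()
    return {"metadata": metadata, "claim": claim, "signature": signature}


def verify_credential(credential, key=ISSUER_KEY):
    """Recompute the signature and compare it in constant time."""
    payload = _canonical(credential["metadata"], credential["claim"])
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["signature"])


# Hypothetical issuer and holder identifiers, for illustration only.
credential = issue_credential(
    metadata={"issuer": "did:example:library", "issued": "2022-04-21"},
    claim={"holder": "did:example:alice", "member": True},
)
print(verify_credential(credential))  # True

# Tampering with the claim invalidates the signature.
tampered = {**credential, "claim": {"holder": "did:example:mallory", "member": True}}
print(verify_credential(tampered))  # False
```

Because the signature covers both the metadata and the claim, changing either one after issuance makes verification fail, which is what lets a credential travel in a wallet without a central lookup at presentation time.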

Next, librarian Lambert Heller of Technische Informationsbibliothek (TIB) and Irene Adamski of the Berlin-based SSI firm Jolocom discussed and demonstrated their work in creating self-sovereign identity for academic conferences on a new platform called Condidi. This tool gives people running academic events a platform that issues digital credentials of attendance in a decentralized system. Utilizing open source and decentralized software, this system minimizes the amount of personal information that attendees need to give over to organizers while still allowing participants to track and log records of their attendance. For libraries, this kind of system is crucial – new systems like Condidi help libraries protect user privacy and open up platform innovation.

Self-sovereign identity also utilizes a new tool called a “smart wallet,” which holds one’s credentials and is controlled by the user. For example, at a conference, a user might want to tell the organizer that she is of age, but not share any other information about herself. In a demo of Jolocom’s system, Adamski showed how a wallet could allow a person to share just the information she wants through encrypted keys in a conference situation. Jolocom also allows people to verify credentials using an encrypted wallet. According to Adamski, the best part of self-sovereign identity is that “you don’t have to share if you don’t want to.” This way, “I am in control of my data.”
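The "of age" example can be sketched as a wallet that answers one yes/no question and reveals nothing else. This is not Jolocom's implementation: the wallet contents and function names below are hypothetical, and a real wallet would back the answer with a signed, verifiable proof rather than a plain dictionary.

```python
from datetime import date

# Hypothetical wallet contents; in a real wallet these would be signed credentials.
wallet = {
    "name": "Alice Example",
    "birth_date": date(1990, 5, 17),
    "email": "alice@example.org",
}


def is_of_age(birth_date, today, minimum_age=18):
    """Return True if the holder has reached minimum_age on the given day."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)
    return age >= minimum_age


def present(wallet, today):
    """Build the presentation: a single yes/no answer, nothing else from the wallet."""
    return {"of_age": is_of_age(wallet["birth_date"], today)}


presentation = present(wallet, today=date(2022, 4, 21))
print(presentation)  # {'of_age': True}
```

The organizer receives only the derived answer; the name, email address, and exact birth date stay in the wallet.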

To conclude, Heller discussed a recent movement in Europe called “Stop Tracking Science.” To combat publishing oligopolies and data analytics companies, a group of academics have come together to create scholar-led infrastructure. As Heller says, in the current environment, “Your journal is reading you,” which is a terrifying thought about scholarly communications.

These academics are hoping to move toward shared responsibility and open, decentralized infrastructure using the major building blocks that already exist. One example of how academia is already decentralized is through PIDs, or persistent identifiers, which are already widely used through systems like ORCID. According to Heller, these PIDs are “part of the commons” and can be shared in a consistent, open manner across systems, which could be used in a decentralized manner for personal identity rather than a centralized one. To conclude, Heller said, “There is no technical fix for social issues. We need to come up with a model for how trust works in research infrastructure.”

It is clear that self-sovereign identity holds great promise as part of a movement for technology that is privacy-respecting, open, transparent, and empowering. In this future, it will be possible to have a verified identity that is held by you, not by a big corporation – the vision that we are setting out to achieve. Want to help us get there? 

Join us at the next events hosted by METRO Library Council, Internet Archive, and Library Futures. https://metro.org/decentralizedweb

Links Shared

Resource guide for this session: https://archive.org/details/resource-guide-session-03-decentralized-identity
All resource guides: https://metro.org/DWebResourceGuides
Decentralized ORCID: https://whoisthis.wtf
Internet Identity Workshop: https://internetidentityworkshop.com/
Jolocom: https://jolocom.io/
Condidi: https://labs.tib.eu/info/en/project/condidi/
TruAge: https://www.convenience.org/TruAge/Home
DIACC Trust Framework: https://diacc.ca/trust-framework/
PCTF-CCP: https://canada-ca.github.io/PCTF-CCP
TruAge Digital ID Verification Solution: https://www.convenience.org/Media/Daily/2021/May/11/2-TruAgeTM-Digital-ID-Verification-Solution_NACS
NuData Security: https://nudatasecurity.com/passive-biometrics/
Kaliya Young’s Book, Domains of Identity: https://identitywoman.net/wp-content/uploads/Domains-of-Identity-Highlights.pdf

The post What’s in your smart wallet? “Keeping your Personal Data Personal” appeared first on Internet Archive Blogs.

Supporting Ukrainian Scholars Through Interlibrary Loan

Internet Archive - 15 April 2022 - 2:39am

Internet Archive’s full collection of books and periodicals is now available, for free, to Ukrainian libraries through interlibrary loan (ILL) via RapidILL. Scholars who request materials through ILL get PDFs of articles and book chapters from the Internet Archive’s full collections, usually in under an hour. Libraries can learn more and sign up for access here.

The post Supporting Ukrainian Scholars Through Interlibrary Loan appeared first on Internet Archive Blogs.

Sharing Inuit Voices Across Time: Inuit Circumpolar Council Alaska’s Web and Digital Archive

Internet Archive - 13 April 2022 - 8:00pm

Guest post by: Inuit Circumpolar Council Alaska

This post is part of a series written by members of Internet Archive’s Community Webs program. Community Webs advances the capacity for community-focused memory organizations to build web and digital archives documenting local histories and underrepresented voices. For more information, visit communitywebs.archive-it.org/

Can you describe your community and the services and role of your organization within the community?

Inuit Circumpolar Council (ICC) Alaska works on behalf of the Inupiat of the North Slope, Northwest and Bering Straits Regions; St. Lawrence Island Yupik; and the Central Yup’ik and Cup’ik of the Yukon-Kuskokwim Region in Southwest Alaska. ICC Alaska is a national member of ICC International. Since inception in 1977, ICC has gained consultative status II with the United Nations, and is a Permanent Participant of the Arctic Council.

For example, ICC has provisional status with the International Maritime Organization (IMO), is an active member at the Arctic Council senior level and within the working groups, and is a prominent voice at the UN Framework Convention on Climate Change (UNFCCC). Work and engagement occur in many ways at these different fora. Within the UNFCCC, ICC has taken a leadership role in putting forward Indigenous Knowledge and establishing a platform for providing equitable space for multiple knowledge systems. Additionally, at the UNFCCC COP 26, ICC Chair, Dr. Dalee Sambo Dorough, led an ICC delegation made up of Inuit representatives from across the Arctic.

ICC COP26 position paper, available at https://iccalaska.org/wp-icc/wp-content/uploads/2021/10/20211028-en-ICC-COP26-Position-Paper.pdf

An immense amount of work occurs in direct partnership with Inuit communities to inform work at international fora. For example, ICC is facilitating the development of international protocols for Equitable and Ethical Engagement. These protocols will provide a pathway to success for all that want to work within Inuit homelands and whose work impacts the Arctic. The protocols will aid in a paradigm shift in how work, decisions, and policies are currently created and carried out. The paradigm shift will lead toward greater equity and recognition of Inuit sovereignty and Self-determination.

Why was your organization interested in participating in Community Webs? 

The Community Webs program was attractive to ICC because it provided the training and the storage to effectively preserve ICC’s digitized & born-digital archival materials. We were pleased to see this offering as a solution for an ongoing desire to archive the prolific organization’s digital materials & products. This work dovetails nicely with ICC Alaska’s efforts to digitize 47 boxes, or around 80 linear feet of material that span 6 decades, including audio, film, photographic media, and paper documents.

ICC Jam – part 2 – Greenland

Cultural programming as part of the 1983 General Assembly. In this clip, view performances from Greenland’s Tuktak Theater and a Greenlandic choir

ICC advocates for Inuit and Inuit way of life, highlighted by ICC’s General Assembly meetings. The ICC receives its mandate from a General Assembly held every four years. The General Assembly is the heart of the organization, providing an opportunity for sharing information, discussing common concerns, debating issues, and strengthening the unity between all Inuit across our homelands. Through the Community Webs project, ICC Alaska has been able to preserve archival video of the ICC General Assemblies going back 30 years using Archive-It and the Internet Archive, as well as all newsletters, press releases, resolutions, social media campaigns, and reports published on its website. These are a significant record of ICC advocacy, but more importantly, Inuit political and cultural heritage.

Moses Wassillie’s Oral History of first ICC General Assembly in 1977, available at: http://oralhistory.library.uaf.edu/88/88-49-114_T01.pdf

Why do you think it is important for public libraries, community archives, and other local and community-based organizations to do this work?

Community-based organizations are uniquely positioned as both a part of and apart from the community. This vantage point allows for the self-reflection and observation needed for web archiving, as well as the relationships within the community to create the space and dialogue needed for community archiving projects. By building more capacity within community-based organizations for web archiving and digital preservation efforts, we can expand the recorded historical narrative and humanities-based inquiries in a multitude of directions, to truly reflect the diversity of our world & time.

Where do you hope to see your web archiving program going?

While the core goal of this work is to make ICC documents and its historical narrative more accessible and discoverable within ICC, to ICC’s member organizations, international bodies, and researchers, our aspirations are much bigger. Our hope is that this web archive goes beyond the core goal to inspire, delight, hearten, inform, and add depth to the conversations Inuit are having about cultural identity, relationship to the land, hunting, advocacy, self-determination, and self-governance.

We are curious about the intangible outcomes: What new work does the archive inspire? How does the archive add depth & historical weight to existing projects, discussions, and advocacy? What stories and knowledge get re-remembered, or re-investigated after viewing archival materials? What advocacy, ethics, and philosophical works come from Inuit leaders informed by the legacy that the archive shared? Are youth leaders interested in adding to the archive?

Is there anything you would like your organization to contribute back to the broader community of web archiving and/or local history in the form of documentation, workflows, policy drafts or other resources?

We have several aspirations. Firstly, we want to tell Inuit stories. The archive is another manifestation of that mission: to record and share Inuit voices across time, and to increase access to those voices, information, knowledge, and history. The ICC Archival holdings are a historically unique & culturally significant telling of Inuit cultural heritage, history (including political history), educational pedagogy, philosophy, self-determination, values, ethics, environmental stewardship, and Indigenous Knowledge. It is important to create a way for Inuit to discover and interact with this work. Community Webs has offered a new tool in our toolkit.

Secondly, the goal is to move forward conversations about categorization and information management for indigenous communities. What does that look like in best practice? Can we, together with other Inuit archives, improve on existing practices to create a more equitable and ethical engagement with Inuit-produced information, the management of that information, and the discovery and access of that information?

What are you most excited to learn through your participation in Community Webs?

It was exciting to discover that many Inuit and Alaska Native resources have already been preserved using the Internet Archive. These resources are often affected by insufficient financial support. Being able to have a preserved and accessible copy of these resources is an important step towards creating the bigger picture of the historical record of Inuit advocacy. As part of the Community Webs meetings, it was exciting to hear from other tribal librarians and community archivists across the country & world. Additionally, it was exciting to hear from speakers whose work informs our community archival work at ICC Alaska – such as Chaitra Powell who created (among other amazing things) the “Archive in a Backpack” project.

What impact do you think web archiving could have within your community?

Hopefully this work inspires other organizations to also preserve their digital assets, creating a richer narrative of Inuit political and cultural heritage.

What do you foresee as some of the challenges you may face?

We are eager to preserve our social media channels that have replaced the DRUM newsletter as a vehicle for keeping our community up-to-date on ICC’s work. Ongoing challenges with Facebook and Instagram archiving are preventing us from doing that. Hopefully these issues are resolved in favor of the communities who created the content and who bring their community and connections to these software platforms.

The post Sharing Inuit Voices Across Time: Inuit Circumpolar Council Alaska’s Web and Digital Archive appeared first on Internet Archive Blogs.

Meet the Librarians: Alexis Rossi, Media & Access

Internet Archive - 13 April 2022 - 2:00pm

To celebrate National Library Week 2022, we are taking readers behind the scenes to Meet the Librarians who work at the Internet Archive and in associated programs.

Alexis Rossi has always loved books and connecting others with information. After receiving her undergraduate degree in English and creative writing, she became a book editor and then worked in online news. 

Alexis Rossi

In 2006, Rossi joined the staff of the Internet Archive. She was working on the launch of the Open Library project when she recognized the need to learn more about how to best organize materials. She enrolled at San Jose State University and earned her Master’s of Library and Information Science in 2010.

“It gave me a better grasp of how to hierarchically organize information in a way that is sensible and useful to other libraries,” Rossi said. “It also gave me better familiarity with how other more traditional libraries actually work—the types of data and systems they use.”

Rossi concentrated on web interfaces for library information, understanding digital metadata, and how to operate as a digital librarian. In addition to overseeing the Open Library project at the Internet Archive, Rossi managed a revamp of the organization’s website, ran the Wayback Machine for four years, founded the webwide crawling program, and is currently a librarian and director of media & access.

“One of the themes of my life is trying to empower people to do whatever they want to do,” said Rossi, who grew up in Monterey, California, and now lives in San Francisco. “Giving people the resources to teach themselves—whatever they want to learn—is my driving force.”

“Giving people the resources to teach themselves—whatever they want to learn—is my driving force.”

Alexis Rossi, Media & Access

Rossi acknowledges she is privileged to have the means to avail herself of an abundance of information, while many in other parts of the world do not. There are so many societal problems she cannot solve, Rossi said, but she believes her work is making a contribution.

“We can build a library that allows people to access information for free, wherever they are, and however they can get to it, in whatever way. That, to me, is incredibly important,” Rossi said. It’s also rewarding to help patrons discover new information and recover materials they may have thought were lost, she added.

When she’s not working, Rossi enjoys making funky jewelry and elaborate cakes (a skill she learned on YouTube).

Among the millions of items and collections in the Internet Archive, what is Rossi’s favorite? Video and audio recordings of her dad, now 73, playing the piano, organ and accordion: “It’s just so good. It’s such a perfect little piece of history.”

The post Meet the Librarians: Alexis Rossi, Media & Access appeared first on Internet Archive Blogs.

Brainstorming UNESCO AI Ethics Recommendations

Interpares Trust AI - 11 April 2022 - 7:16pm
UNESCO’s General Conference adopted the Recommendation on the Ethics of Artificial Intelligence in November 2021. This document is the first global instrument regulating Artificial Intelligence (AI). According to its preamble, it is “a standard-setting instrument developed through a global approach, based on international law, focusing on human dignity and human rights, as well as gender equality, social and economic justice and development, physical and mental well-being, diversity, interconnectedness, inclusiveness, and environmental and ecosystem protection”. It aims to guide the conception and use of AI technologies in a responsible direction and intends to provide universal values and principles for their development.

You are invited to a three-hour “brainstorming” session on April 20, from 13:00 to 16:00 CET, where the guiding ideas of the Recommendation will be presented. This brainstorming will be followed by regional conferences or seminars on specific themes. These regional activities will ensure the original appropriation of the text, taking fully into account the regional context and cultural specificities. Invitations to these regional activities will follow.

https://events.unesco.org/event?id=Brainstorming_on_the_UNESCO_AI_Ethics_Recommendation3733139375&lang=1033

Library as Laboratory Recap: Curating the African Folktales in the Internet Archive’s Collection

Internet Archive - 11 April 2022 - 5:34pm

Laura Gibbs and Helen Nde share a passion for African folktales. They are both active researchers and bloggers on the subject who rely on the Internet Archive’s extensive collection in their work.

In the third of a series of webinars highlighting how researchers in the humanities use the Internet Archive, Gibbs and Nde spoke on March 30 about how they use the online library and contribute to its resources.

Watch now:

Gibbs was teaching at the University of Oklahoma in the spring of 2020 when the campus library shut down due to the pandemic. “That’s when I learned about controlled digital lending at the Internet Archive and that changed everything for me. I hadn’t realized how extensive the materials were,” said Gibbs, who was trained as a folklorist. She retired last May and began a project of cross-referencing her bookshelves of African and African-American folktales to see how many were available at the Internet Archive. Being able to check out one digital title at a time through controlled digital lending (CDL) opened up new possibilities for her research. 

“It was just mind boggling to me and so exciting,” she said of discovering the online library. “I want to be a provocation to get other people to go read, do their own writing and thinking from books that we can all access. That’s what the Internet Archive has miraculously done.”

A Reader’s Guide to African Folktales at the Internet Archive by Laura Gibbs. Now available.

Gibbs said it has been very helpful to use the search function using the title of a book, name of an illustrator or some other kind of detail. With an account, the user can see the search results and borrow the digital book through CDL. “It’s all super easy to do. And if you’re like me and weren’t aware of the amazing resources available through controlled digital lending, now is the time to create your account at the Internet Archive,” Gibbs said. 

Every day, Gibbs blogs about a different book and rewrites a 100-word “tiny-tale” synopsis. In less than a year, she compiled A Reader’s Guide to African Folktales at the Internet Archive, a curated bibliography of hundreds of folktale books that she has shared with the public through the Internet Archive. Some are in the public domain, but many are later works and only available for lending one copy at a time through CDL. 

In her work, Nde explores mythological folklore from the African continent and is dedicated to preserving the storytelling traditions of African peoples, which are largely oral. Nde maintains the Mythological Africans website, where she hosts storytelling sessions and modern lectures and posts essays.

“[The Internet Archive] is an amazing resource of information online, which is readily available, and really goes to dispel the notion that there is no uniformity of folklore from the African continent,” Nde said. “Through Mythological Africans, I am able to share these stories and make these cultures come alive as much as possible.”

As an immigrant in the United States from Cameroon, Nde began to research the topic of African folklore because she was curious about exploring her background and identity. She said she found a community and a creative outlet for examining storytelling, poetry, dance and folktales. Nde said examining Gibbs’ works gave her an opportunity to reconnect with some of the favorite books from her childhood. She has also discovered helpful reference books through the Internet Archive collection. Nde is active on social media (Twitter.com/mythicafricans) and has a YouTube channel on African mythology. She recently collaborated on a project with PBS highlighting the folklore behind an evil entity called the Adze, which can take the form of a firefly.

The presenters said when citing material from the Internet Archive, not only can they link to a source, a blog or an academic article, they can link to the specific page number in that source. This gives credit to the author and also access to that story for anybody who wants to read it for themselves.

The next webinar in the series, Television as Data: Opening TV News for Deep Analysis and New Forms of Interactive Search, on April 13 will feature Roger MacDonald, Founder of the TV News Archive and Kalev Leetaru, Data Scientist at GDELT. Register now.

The post Library as Laboratory Recap: Curating the African Folktales in the Internet Archive’s Collection appeared first on Internet Archive Blogs.

Meet the Librarians: Lisa Seaberg, Patron Services & Open Library

Internet Archive - 11 April 2022 - 4:08pm

To celebrate National Library Week 2022, we are taking readers behind the scenes to Meet the Librarians who work at the Internet Archive and in associated programs.

Like any good librarian, Lisa Seaberg of the Internet Archive’s patron services team is prepared to answer the question: Can you recommend a book? In fact, Seaberg has 1,729 suggestions. She has organized what she wants to read in a publicly available list on Open Library.

Lisa Seaberg

“I’ve had a lifelong interest in reading and books,” said Seaberg, who worked as an assistant in her high school library in Milford, Connecticut. It was there that a mentoring librarian helped shape her taste in reading and introduced her to The Hitchhiker’s Guide to the Galaxy by Douglas Adams. 

Seaberg went on to earn her bachelor’s degree in library science from Southern Connecticut State University in 1996. She learned about the book publishing industry, practical skills of cataloguing, Boolean searching, and managing databases. She later earned a master’s degree in digital media from Quinnipiac University in Connecticut.

In 2017, Seaberg began to volunteer with Open Library and was hired to join the Internet Archive staff in 2020 to work for patron services. Based in Amsterdam, she responds to email requests to connect users with resources and helps coordinate a team of more than 200 volunteers to fix metadata issues. Seaberg works to maintain the digital collection, identify duplicates, and make sure the record represents the available books. She also fulfills interlibrary loan requests, as part of the Internet Archive’s new ILL service.

“It’s rewarding to make something discoverable.”

Lisa Seaberg, Patron Services & Open Library

Prior to joining the Internet Archive, Seaberg worked at Gateway Computers in the late 90s, where she gained useful technology experience. She later worked in communications for a hospital, managing its website. Those positions provided her with a sense of information architecture, she said, that she has applied to her work at the Internet Archive.

Seaberg said she is fascinated by everything that the Internet Archive provides to the public. In her job, she enjoys working with the book metadata. “It’s rewarding to make something discoverable,” she said. If people have an author they like, Seaberg tries to make sure there are subject headings and tags to make it easier for them to find related materials of interest. 

Recently, Seaberg said, it’s been meaningful to be involved in efforts to provide access to books being challenged by local school districts because of controversial content. She’s helped assemble digital collections of titles being targeted to ensure continuous access should an entity decide to ban them. 

When Seaberg is not working, she loves to play board games, gravitating to hobbyist, European games such as Gaia Project, a complex, economy-building game that takes place in space. Her other main hobby is book hunting at charity shops and openbare boekenkastjes (free libraries) in and around her home in Amsterdam. Since Seaberg has limited shelf space, she sticks to her rule of only buying books that are on her Open Library Want To Read list.

Among her favorite projects when it comes to the Internet Archive collection: Organizing the profiles of individual authors to make sure their works are all consolidated and easy to find for patrons. 

The post Meet the Librarians: Lisa Seaberg, Patron Services & Open Library appeared first on Internet Archive Blogs.

Meet the Librarians: Sawood Alam, Wayback Machine

Internet Archive - 8 April 2022 - 2:00pm

To celebrate National Library Week 2022, we are taking readers behind the scenes to Meet the Librarians who work at the Internet Archive and in associated programs.

Sawood Alam was born and raised on a farm in a remote village of India with no smartphones, television or electricity. 

Sawood Alam

“Books were one of the only means of learning and entertainment for us,” said Alam, who checked out as many books as he could from his school library every Thursday. “I had to take my buffalo out every afternoon. It was a boring task out in the field with no one to talk to, so books were my companions.”

When he was 10 years old, Alam helped at his school library, which was all run by children. He said he learned a lot about sorting, indexing and categorizing books—the beginning of a lifelong passion.  

Nearly two decades later, Alam completed his PhD in computer science with a specialty in web archiving from Old Dominion University. He was part of the Web Science and Digital Libraries Research Group at the university. 

Alam joined the staff of the Internet Archive as a web and data scientist in 2020. Working with the Wayback Machine team, Alam supports researchers from all around the world conducting analyses with Internet Archive collections. When someone has a research question that involves interaction with Wayback Machine APIs or downloading a large number of archived web pages, he helps prepare the data and provides technical assistance. Alam tries to improve the discoverability of items in massive web collections. His data insights and quality assurance efforts enhance web crawling and Wayback Machine operations.

Alam also collaborates with partners from academia, industry, and organizations on various research, development and standardization efforts. His own research has focused on archive profiling, interoperability and cooperation among archives, which are all topics the data scientist writes about and shares on Twitter.

“My first language is Urdu so when I see books and materials in Urdu in the Internet Archive it brings me joy.”

Sawood Alam, Wayback Machine

Formal academic training in the field of web archiving is uncommon, said Alam. With his background, he’s able to understand the data scientists’ research needs, he said, making his skills a perfect match for his position at the Internet Archive. 

“‘Universal Access to All Knowledge’ is something that certainly resonates for me,” Alam said of the Internet Archive’s mission. “I would like to focus on making it more global.”

In recognition of his contribution to the library community with digital preservation, Alam received the NDSA 2020 Future Stewards Innovation Award.

Beyond his work at the Internet Archive, Alam serves the digital library and web archiving communities by peer-reviewing research papers for journals, chairing sessions at conferences in his fields of interest, and participating in International Internet Preservation Consortium (IIPC) conversations focused on interoperability, collaboration, and related topics.

Favorite items in the Internet Archive for Alam? “I established a volunteer-driven online Unicode Urdu books library, UrduWeb Digital Library, during my graduation years. My first language is Urdu so when I see books and materials in Urdu in the Internet Archive it brings me joy. Thanks to the Wayback Machine, I was able to narrate the lost story of the evolution of Urdu blogging on the 20th anniversary of the Internet Archive.”

The post Meet the Librarians: Sawood Alam, Wayback Machine appeared first on Internet Archive Blogs.

New additions to the Internet Archive for March 2022

Internet Archive - 7 April 2022 - 5:00pm

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons:

Books – 60,379 New Items in March

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Martin’s mice
Castles
Pygmy goats

Audio Archive – 93,954 New Items in March

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

Nthn@All : Miraicult Refactor
Neal Francis Live at Terminal West on 2022-02-26
Leftover Salmon Live at Crystal Ballroom on 1997-03-29

LibriVox Audiobooks – 122 New Items in March

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

Short Poetry Collection 225
The Old House
The Wreck of the Corsaire

78 RPMs and Cylinder Recordings – 7,423 New Items in March

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Culbertson Sasha – Aeolian Vocalion D 02143 1923
Magyar dis örség felváltás
ONE RAINDROP DOESN’T MEAN A SHOWER

Live Music Archive – 1,098 New Items in March

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Goose Live at 9:30 Club on 2022-03-02
Circles Around The Sun Live at Infinity Hall on 2022-03-17

Netlabels – 186 New Items in March

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Explore.

Dedicated Rough Memory Glamour Glide Tracks

Movies – 25 New Items in March

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

The post New additions to the Internet Archive for March 2022 appeared first on Internet Archive Blogs.

Building the Collective COVID-19 Web Archive

Internet Archive - 6 April 2022 - 6:00pm

The COVID-19 pandemic has been life-changing for people around the globe. As efforts to slow the progress of the virus unfolded in early 2020, librarians, archivists and others with interest in preserving cultural heritage began considering ways to document the personal, societal, and systemic impacts of the global pandemic. These efforts included preserving physical, digital and web-based information and artifacts for posterity and future research use.

Clockwise from top left: blog post about local artists making masks from Kansas City Public Library’s “COVID-19 Outbreak” collection; youth vaccination campaign website from American Academy of Pediatrics’ “AAP COVID” collection; COVID-19 case dashboard from Carnegie Mellon University’s “COVID-19” collection; and COVID-19 FAQs from Library of Michigan’s “COVID-19 in Michigan” collection.

In response, the Internet Archive’s Archive-It service launched a COVID-19 Web Archiving Special Campaign starting in April 2020 to allow existing Archive-It partners to increase their web archiving capacity or new partners to join to collect COVID-19 related content. In all, more than 100 organizations took advantage of the COVID-19 Web Archiving Special Campaign and more than 200 Archive-It partner organizations built more than 300 new collections specifically about the global pandemic and its effects on their regions, institutions, and local communities. From colleges, universities, and governments documenting their own responses to community-driven initiatives like Sonoma County Library’s Sonoma Responds Community Memory Archive, a variety of information has been preserved and made available. These collections are critical historical records in and of themselves, and when taken in aggregate will allow researchers a comprehensive view into life during the pandemic.

Sonoma County Library’s Sonoma Responds: A Community Memory Archive encouraged community members to contribute content documenting their lives during the COVID-19 pandemic.

We have been exploring with partners ways to provide unified access to the hundreds of individual COVID-related web collections created by Archive-It users. When the Institute of Museum and Library Services launched its American Rescue Plan grant program, part of the broader American Rescue Plan, a $1.9 trillion stimulus package signed into law on March 11, 2021, we applied and were awarded funding to build a COVID-19 Web Archive access portal: a dedicated search and discovery platform for COVID-19 web collections from hundreds of institutions. The COVID-19 Web Archive will allow for browsing and full-text search across diverse institutional collections and enable other access methods, including making datasets and code notebooks available for data analysis of the aggregate collections by scholars. This work will support scholars, public health officials, and the general public in fully understanding the scope and magnitude of our historical moment now and into the future.

The COVID-19 Web Archive is unique in that it will provide a unified discovery mechanism to hundreds of aggregated web archive collections built by a diverse group of over 200 libraries from over 40 US states and several other nations, from large research libraries to small public libraries to government agencies.

If you would like your Archive-It collection or a portion of it included in the COVID-19 Web Archive, please fill out this interest form by Friday, April 29, 2022. If you are an institution in the United States that has COVID-related web archives collected outside of Archive-It or Internet Archive services that you are interested in having included in the COVID-19 Web Archive, please contact covidwebarchive@archive.org.

The post Building the Collective COVID-19 Web Archive appeared first on Internet Archive Blogs.

Meet the Librarians: Catherine Falls, Community Webs

Internet Archive - 6 April 2022 - 3:37am

To celebrate National Library Week 2022, we are taking readers behind the scenes to Meet the Librarians who work at the Internet Archive and in associated programs.

In the spring of 2021, Catherine Falls was hired by the Internet Archive to launch the Community Webs program in Canada. She was excited about the prospect of helping public libraries, museums, local historical societies and archives digitally preserve important material. 

Catherine Falls

“Most web archiving happens at really large institutions, so much of the experience of local communities is missing from the historic record. It’s giving us a biased view of contemporary society,” Falls said. “The more of these local organizations that we can get to do this archiving, the more the historic record will be brought into balance.”

Since her efforts began, the Internet Archive has partnered with 43 institutions and organizations in Canada to build community-based collections. Falls said it’s been rewarding to follow the growth and variety of web-archiving projects. For example, the Milton Public Library in Ontario is working with the Halton Black History Awareness Society and other organizations to document items that may not otherwise be captured on the web. Meanwhile, the ArQuives: Canada’s LGBTQ2+ Archives is working with its community members to build web archive collections that capture the community’s web presence.

Falls earned bachelor’s degrees in commerce and art history from the University of British Columbia. She also has a master’s degree in library science and a master’s degree in art history from the University of Toronto. Before coming to the Internet Archive, she worked as an archivist in Canada at several institutions including York University and the Archives of Ontario.

“I’m interested in the free circulation of ideas and the library as a place where public knowledge is accessible.”

Catherine Falls, Community Webs

“I was drawn to libraries as a kind of place that facilitates research–which for me is the most exciting phase of any project,” Falls said. “I’m interested in the free circulation of ideas and the library as a place where public knowledge is accessible. I like how the intellectual possibilities of a library intersect with the library as a community space.”

Falls says her background gives her a solid understanding of the basic functions of the library and the common language used within the profession. With that theoretical grounding, she said she can approach her work from a critical perspective to make improvements. 

“It’s important to keep in mind that libraries are not infallible institutions. We need to be constantly questioning our practice and finding ways to be better,” Falls said. “It’s easy to say libraries are these beautiful, idyllic institutions. But I think it’s healthy to take a critical eye toward the work we do so that we can try to live up to our ideals in terms of whose stories we tell, who has access to our services, and what is preserved for the long term.” 

Falls said she enjoys the mission-driven focus of the Internet Archive. Operating in the library, technology and archival world, it has a dynamic, nimble culture that provides fertile ground in which to explore new ideas, she said. 

Falls’ favorite holdings are among some of the quirkier arts-related web archive collections in the Internet Archive: University of Michigan School of Information’s 20th Century Minimalist Music; Dalhousie University’s Artist-Run Centres in Halifax, Nova Scotia; and Corning Museum of Glass’s Contemporary Glass Podcasts.

The post Meet the Librarians: Catherine Falls, Community Webs appeared first on Internet Archive Blogs.
