How To Download A Url In Python And Save It To Json File

JSON is a common standard used by websites and APIs, and it is even natively supported by modern databases such as PostgreSQL. In this article, we'll present a tutorial on how to handle JSON data with Python. Let's start with the definition of JSON.

Navigation:

  • What is JSON?
  • JSON in Python
  • Converting JSON string to Python object
  • Converting JSON file to Python object
  • Converting Python object to JSON string
  • Writing Python object to a JSON file
  • Converting custom Python objects to JSON objects
  • Creating Python class objects from JSON objects
  • Loading vs dumping
  • Conclusion

What is JSON?

JSON, or JavaScript Object Notation, is a format that uses text to store data objects. In other words, it is a data structure for representing objects as text. Even though it's derived from JavaScript, it has become a de facto standard for transferring objects.

This format is supported by most popular programming languages, including Python. Most commonly, JSON is used to transfer data objects via APIs. Here is an example of a JSON string:

    {
        "name": "United States",
        "population": 331002651,
        "capital": "Washington D.C.",
        "languages": [
            "English",
            "Spanish"
        ]
    }

In this instance, JSON data looks like a Python dictionary. Just like dictionaries, JSON contains data in key-value pairs. However, JSON data can also be a string, a number, a boolean, or a list.

Before JSON became popular, XML had been the common choice to represent data objects in a text format. Here is an example of the same data in the XML format:

    <?xml version="1.0" encoding="UTF-8"?>
    <country>
        <name>United States</name>
        <population>331002651</population>
        <capital>Washington D.C.</capital>
        <languages>
            <language>English</language>
            <language>Spanish</language>
        </languages>
    </country>

As is evident here, JSON is lightweight. This is one of the primary reasons why JSON is so popular. If you want to read more about the JSON standard, head over to the official JSON website.

JSON in Python

Python supports JSON data natively. The Python json module is part of the Standard Library. The json module can handle the conversion of JSON data from JSON format to the equivalent Python objects, such as dictionaries and lists. The json module can also convert Python objects into the JSON format.

The json module provides the functionality to write custom encoders and decoders, and there is no separate installation needed. You can find the details in the official documentation for the Python json module.

In the remainder of this tutorial, we will explore this package. We're going to convert JSON to dictionaries and lists, and the other way round. We'll also explore how to handle custom classes.

Converting JSON string to Python object

JSON data is frequently stored in strings. This is a common scenario when working with APIs. The JSON data would be stored in string variables before it can be parsed. As a result, the most common task related to JSON is to parse the JSON string into a Python dictionary. The json module can take care of this task easily.

The first step would be importing the Python json module. This module contains two important functions: loads and load.

Note that the first function looks like a plural form, but it is not. The letter 's' stands for 'string', and the name is read as 'load-s'.

loads is the function to use for parsing JSON data held in a string. The other function, load, reads JSON from a file object rather than from a string. This is covered at length in a later section.
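As a minimal sketch of the difference between the two functions, the snippet below parses the same document with both. It uses io.StringIO from the Standard Library as a stand-in for an open file, and the variable names are illustrative only.

```python
import io
import json

raw = '{"name": "United States", "population": 331002651}'

# loads() parses JSON held in a str (bytes and bytearray are accepted too).
from_text = json.loads(raw)
from_bytes = json.loads(raw.encode("utf-8"))

# load() reads from any object that has a .read() method, such as an open file.
# io.StringIO stands in for a real file here.
from_file = json.load(io.StringIO(raw))

print(from_text == from_bytes == from_file)
# OUTPUT:  True
```

All three calls produce the same dictionary; only the source of the JSON text differs.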

Let's start with a simple example. The example JSON data is as follows:

    {
        "name": "United States",
        "population": 331002651
    }

JSON data can be stored as a JSON string before it is parsed. Even though we could use Python's triple-quote convention to store multi-line strings, here we remove the line breaks and keep the string on one line for readability.

    # JSON string
    country = '{"name": "United States", "population": 331002651}'
    print(type(country))

The output of this snippet will confirm that this is indeed a string:

          <class 'str'>        

We can call the json.loads() method and provide this string as a parameter.

    import json

    country = '{"name": "United States", "population": 331002651}'
    country_dict = json.loads(country)

    print(type(country))
    print(type(country_dict))

The output of this snippet will confirm that the JSON data, which was a string, is now a Python dictionary.

    <class 'str'>
    <class 'dict'>

This dictionary can be accessed as usual:

    print(country_dict['name'])
    # OUTPUT:   United States

It is important to note here that the json.loads() method will not always return a dictionary. The data type that is returned will depend on the input string. For example, this JSON string will return a list, not a dictionary.

    countries = '["United States", "Canada"]'
    countries_list = json.loads(countries)

    print(type(countries_list))
    # OUTPUT:  <class 'list'>

Similarly, if the JSON string contains true, it will be converted to the equivalent Python boolean value, which is True.

    import json

    bool_string = 'true'
    bool_type = json.loads(bool_string)
    print(bool_type)
    # OUTPUT:  True

The following table shows JSON objects and the Python data types after conversion. For more details, see the Python docs.

JSON              Python
object            dict
array             list
string            str
number (integer)  int
number (real)     float
true              True
false             False
null              None
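The table can be checked with a short script. The following is a sketch that parses one JSON document containing every kind of value and records the resulting Python type names; the key names in the document are made up for illustration.

```python
import json

# One value of each JSON kind; the key names are arbitrary.
document = (
    '{"obj": {"a": 1}, "arr": [1, 2], "text": "hi", "whole": 7, '
    '"real": 7.5, "yes": true, "no": false, "nothing": null}'
)

parsed = json.loads(document)

# Map every key to the name of the Python type its value was converted to.
type_names = {key: type(value).__name__ for key, value in parsed.items()}

for key, name in type_names.items():
    print(f"{key:8} -> {name}")
```

Note that both true and false map to the single Python type bool, and null becomes None, whose type is NoneType.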

Now, let's move on to the next topic: parsing a JSON file into a Python object.

Converting JSON file to Python object

Reading JSON files to parse JSON data into Python data is very similar to how we parse the JSON data stored in strings. Apart from the json module, Python's native open() function will also be required.

Instead of the loads() method, which reads JSON strings, the method used to read JSON data in files is load().

The load() method takes a file object and returns the JSON data parsed into a Python object.

To get the file object from a file path, Python's open() function can be used.

Save the following JSON data as a new file and name it united_states.json:

    {
        "name": "United States",
        "population": 331002651,
        "capital": "Washington D.C.",
        "languages": [
            "English",
            "Spanish"
        ]
    }

Enter this Python script in a new file:

    import json

    with open('united_states.json') as f:
        data = json.load(f)

    print(type(data))

Running this Python file prints the following:

    <class 'dict'>

In this example, the open function returns a file handle, which is supplied to the load method.

The variable data contains the JSON as a Python dictionary. This means that the dictionary keys can be checked as follows:

          print(data.keys()) # OUTPUT:  dict_keys(['name', 'population', 'capital', 'languages'])        

Using this data, the value of name can be printed as follows:

    print(data['name'])
    # OUTPUT:  United States
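Files on disk can be missing or can contain malformed JSON, so in practice the call to json.load() is often wrapped in error handling. The sketch below shows one way this might look. The helper name read_json and the file names are hypothetical, and a temporary directory is used so that the example is self-contained.

```python
import json
import os
import tempfile


def read_json(path):
    """Return the parsed contents of a JSON file, or None if it cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"{path}: file not found")
    except json.JSONDecodeError as err:
        # err.lineno and err.colno give the position of the syntax error.
        print(f"{path}: invalid JSON at line {err.lineno}, column {err.colno}")
    return None


# Demonstration files are created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.json")
    bad = os.path.join(tmp, "bad.json")
    with open(good, "w", encoding="utf-8") as f:
        f.write('{"name": "United States"}')
    with open(bad, "w", encoding="utf-8") as f:
        f.write('{"name": "United States",}')  # a trailing comma is not valid JSON

    good_data = read_json(good)
    bad_data = read_json(bad)
    missing_data = read_json(os.path.join(tmp, "missing.json"))
```

Returning None on failure is only one possible design; re-raising the exception may be more appropriate when the caller cannot continue without the data.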

In the previous two sections, we examined how JSON can be converted to Python objects. Now, it's time to explore how to convert Python objects to JSON.

Converting Python object to JSON string

Converting Python objects to JSON objects is also known as serialization or JSON encoding. It can be achieved by using the function dumps(). It is read as dump-s, and the letter 's' stands for string.

Here is a simple example. Save this code in a new file as a Python script:

    import json

    languages = ["English", "French"]
    country = {
        "name": "Canada",
        "population": 37742154,
        "languages": languages,
        "president": None,
    }

    country_string = json.dumps(country)
    print(country_string)

When this file is run with Python, the following output is printed:

    {"name": "Canada", "population": 37742154, "languages": ["English", "French"], "president": null}

The Python object is now a JSON object. This simple example demonstrates how easy it is to convert a Python object to a JSON object. Note that the Python object was a dictionary. That's the reason it was converted into a JSON object type. Lists can be converted to JSON as well. Here is the Python script and its output:

    import json

    languages = ["English", "French"]

    languages_string = json.dumps(languages)
    print(languages_string)
    # OUTPUT:   ["English", "French"]

It's not just limited to a dictionary and a list. Values of type str, int, float, and bool, and even None, can be converted to JSON.

Refer to the conversion table below for details. As you can see, only the dictionary is converted to the JSON object type. For more details, see the official documentation.

Python       JSON
dict         object
list, tuple  array
str          string
int, float   number
True         true
False        false
None         null
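One consequence of this table is that encoding followed by decoding does not always give back the original Python object. The sketch below illustrates two such cases with made-up data: a tuple comes back as a list, and a non-string dictionary key comes back as a string.

```python
import json

original = {
    "languages": ("English", "French"),  # tuple
    1: "integer key",                    # non-string key
    "president": None,
}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)
# OUTPUT:  {"languages": ["English", "French"], "1": "integer key", "president": null}
print(decoded == original)
# OUTPUT:  False
```

If the distinction between tuples and lists, or between integer and string keys, matters to the program, it has to be restored explicitly after decoding.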

Writing Python object to a JSON file

The method used to write a JSON file is dump(). This method is very similar to the method dumps(). The only difference is that while dumps() returns a string, dump() writes to a file.

Here is a simple demonstration. This will open the file in writing mode and write the data in JSON format. Save this Python script in a file and run it.

    import json

    # Tuple is encoded to JSON array.
    languages = ("English", "French")
    # Dictionary is encoded to JSON object.
    country = {
        "name": "Canada",
        "population": 37742154,
        "languages": languages,
        "president": None,
    }

    with open('countries_exported.json', 'w') as f:
        json.dump(country, f)

When this code is executed using Python, the file countries_exported.json is created (or overwritten), and its contents are the JSON.

However, you will notice that the entire JSON is on one line. To make it more readable, we can pass one more parameter to the dump() function as follows:

    json.dump(country, f, indent=4)

This time when you run the code, it will be nicely formatted with indentation of 4 spaces:

    {
        "name": "Canada",
        "population": 37742154,
        "languages": [
            "English",
            "French"
        ],
        "president": null
    }

Note that this indent parameter is also available for the dumps() method. The only difference between the signatures of dump() and dumps() is that dump() needs a file object.
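To illustrate formatting options with dumps(), here is a small sketch that produces a compact string and a pretty-printed one from the same dictionary. Besides indent, it uses the separators and sort_keys parameters of the json module; the data is illustrative.

```python
import json

country = {
    "name": "Canada",
    "population": 37742154,
    "languages": ["English", "French"],
}

# Compact form: no spaces after the item and key separators.
compact = json.dumps(country, separators=(",", ":"))

# Readable form: four-space indentation with keys in alphabetical order.
pretty = json.dumps(country, indent=4, sort_keys=True)

print(compact)
print(pretty)
```

Both strings decode to the same dictionary, so the choice only affects file size and readability.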

Converting custom Python objects to JSON objects

Let's examine the signature of the dump() method:

    dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True,
         allow_nan=True, cls=None, indent=None, separators=None,
         default=None, sort_keys=False, **kw)

Let's focus on the parameter cls.

If no class is supplied while calling the dump method, both the dump() and dumps() methods default to the JSONEncoder class. This class supports the standard Python types: dict, list, tuple, str, int, float, True, False, and None.

If we try to call the json.dumps() method on any other type, this method will raise a TypeError with the message: Object of type <your_type> is not JSON serializable.

Save the following code as a Python script and run it:

    import json

    class Country:
        def __init__(self, name, population, languages):
            self.name = name
            self.population = population
            self.languages = languages

    canada = Country("Canada", 37742154, ["English", "French"])

    print(json.dumps(canada))
    # OUTPUT:   TypeError: Object of type Country is not JSON serializable

To convert the objects to JSON, we need to write a new class that extends JSONEncoder. In this class, the method default() should be implemented. This method will have the custom code to return the JSON.

Here is an example encoder for the Country class. This class will help convert a Python object to a JSON object:

    import json

    class CountryEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, Country):
                # JSON object would be a dictionary.
                return {
                    "name": o.name,
                    "population": o.population,
                    "languages": o.languages
                }
            else:
                # Base class will raise the TypeError.
                return super().default(o)

This code simply returns a dictionary after confirming that the supplied object is an instance of the Country class; otherwise, it calls the parent class to handle the rest of the cases.

This class can now be supplied to the json.dump() as well as json.dumps() methods.

    print(json.dumps(canada, cls=CountryEncoder))
    # OUTPUT:  {"name": "Canada", "population": 37742154, "languages": ["English", "French"]}
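Subclassing JSONEncoder is not the only option. The default parameter that appears in the dump() signature accepts a plain function, which the json module calls for any object it cannot serialize on its own. The sketch below repeats the Country class so that it runs by itself; the function name encode_country is a hypothetical choice.

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def encode_country(o):
    """Hypothetical helper: called only for objects json cannot serialize itself."""
    if isinstance(o, Country):
        return {"name": o.name, "population": o.population, "languages": o.languages}
    # Mirror the standard behaviour for every other unsupported type.
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


canada = Country("Canada", 37742154, ["English", "French"])
encoded = json.dumps(canada, default=encode_country)
print(encoded)
# OUTPUT:  {"name": "Canada", "population": 37742154, "languages": ["English", "French"]}
```

A function passed through default is convenient for one-off conversions, while an encoder class may be easier to reuse across a larger code base.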

Creating Python class objects from JSON objects

So far, we have discussed how the json.load() and json.loads() methods can create a dictionary, a list, and more. What if we want to read a JSON object and create a custom class object?

In this section, we will create a custom JSON decoder that will help us create custom objects. This custom decoder will allow us to use the json.load() and json.loads() methods so that they return a custom class object.

We will work with the same Country class that we used in the previous section. Using a custom encoder, we were able to write code like this:

    # Create an object of class Country
    canada = Country("Canada", 37742154, ["English", "French"])
    # Use json.dump() to create a JSON file in writing mode
    with open('canada.json', 'w') as f:
        json.dump(canada, f, cls=CountryEncoder)

If we try to parse this JSON file using the json.load() method, we will get a dictionary:

    with open('canada.json', 'r') as f:
        country_object = json.load(f)

    print(type(country_object))
    # OUTPUT:  <class 'dict'>

To get an instance of the Country class instead of a dictionary, we need to create a custom decoder. This decoder class will extend JSONDecoder. In this class, we will write a method named object_hook. In this method, we will create the object of the Country class by reading the values from the dictionary.

Apart from writing this method, we also need to call the __init__ method of the base class and set the value of the parameter object_hook to this method. For simplicity, we can use the same name.

    import json

    class CountryDecoder(json.JSONDecoder):
        def __init__(self, object_hook=None, *args, **kwargs):
            super().__init__(object_hook=self.object_hook, *args, **kwargs)

        def object_hook(self, o):
            decoded_country = Country(
                o.get('name'),
                o.get('population'),
                o.get('languages'),
            )
            return decoded_country

Note that we are using the .get() method to read dictionary keys. This will ensure that no errors are raised if a key is missing from the dictionary.

Finally, we can call the json.load() method and set the cls parameter to the CountryDecoder class.

    with open('canada.json', 'r') as f:
        country_object = json.load(f, cls=CountryDecoder)

    print(type(country_object))
    # OUTPUT:  <class '__main__.Country'>

That's it! We now have a custom object created directly from JSON.
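A hook is called for every JSON object in the input, including nested ones, so a decoder that always builds a Country would also try to convert unrelated objects. As a hedged alternative sketch, the object_hook parameter of json.loads() can take a plain function that converts only the dictionaries with the expected keys. The names COUNTRY_KEYS and as_country, as well as the sample document, are assumptions made for this example.

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


# Keys that a JSON object must contain to be treated as a Country (assumed).
COUNTRY_KEYS = {"name", "population", "languages"}


def as_country(o):
    """Hypothetical hook: build a Country only when all expected keys are present."""
    if COUNTRY_KEYS.issubset(o):
        return Country(o["name"], o["population"], o["languages"])
    return o  # leave every other JSON object as a plain dict


text = (
    '{"country": {"name": "Canada", "population": 37742154, '
    '"languages": ["English", "French"]}, "meta": {"source": "example"}}'
)
result = json.loads(text, object_hook=as_country)

print(type(result["country"]).__name__)
# OUTPUT:  Country
print(type(result["meta"]).__name__)
# OUTPUT:  dict
```

Only the inner object with all three keys becomes a Country; the outer object and the "meta" object remain dictionaries.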

Loading vs dumping

The Python json module has four key functions: dump(), dumps(), load(), and loads(). It is often confusing to remember these functions. The most important thing to remember is that the letter 's' stands for string. Also, read the letter 's' separately in the functions loads() and dumps(); that is, read loads() as load-s and dumps() as dump-s.

Here is a quick table to help you remember these functions:

        File     String
Read    load()   loads()
Write   dump()   dumps()
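The four functions can be seen side by side in one round trip. The following sketch writes and reads the same data through a string and through a file; the file name country.json and the temporary directory are used only to keep the example self-contained.

```python
import json
import os
import tempfile

data = {"name": "Canada", "languages": ["English", "French"]}

# String pair: dumps() writes to a str, loads() reads it back.
text = json.dumps(data)
from_string = json.loads(text)

# File pair: dump() writes to a file object, load() reads it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "country.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    with open(path, encoding="utf-8") as f:
        from_file = json.load(f)

print(from_string == from_file == data)
# OUTPUT:  True
```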

Conclusion

In this tutorial, we explored reading and writing JSON data using Python. Knowing how to work with JSON data is essential, particularly when working with websites. JSON is used to transfer and store data everywhere, including APIs, web scrapers, and modern databases like PostgreSQL.

Understanding JSON is crucial if you are working on a web scraping project that involves dynamic websites. Head over to our blog post for a practical example of JSON being useful for pages with infinite scroll.

About Monika Maslauskaite

Monika Maslauskaite is a Content Director at Oxylabs. A combination of tech-world and content creation is the thing she is super passionate about in her professional path. While free of work, you'll find her watching mystery, psychological (basically, all kinds of mind-blowing) movies, dancing, or just making up choreographies in her head.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.

Source: https://oxylabs.io/blog/python-parse-json

Posted by: bergerontatied.blogspot.com
