📅  Last modified: 2023-12-03 15:27:22.666000             🧑  Author: Mango

Site: *.instagram.com - Python

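Introduction

Instagram is a social media app built around sharing photos and videos, with more than 1 billion monthly active users. Python is a popular programming language with a large ecosystem of ready-made modules and libraries that suits many kinds of applications. This article shows how to use Python to retrieve and process Instagram data, both through Instagram's API and through web scraping.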

Instagram API

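Instagram provides an API that lets developers access platform data programmatically. Because the API is served over HTTP, it can be called from many programming languages, including Python. Through it you can read a user's profile information, published posts, comments, likes, and similar data. The examples in this section request the profile page with the ?__a=1 query parameter, a web endpoint that returns JSON. It is not part of the documented API, so it may require a logged-in session or return a different structure. The following are some things you can do this way: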

Get user information

Look up a user by username and read details such as full name, ID, and follower/following counts.

import requests

username = "instagram"
# JSON view of the profile page; it may require login or change shape.
url = f"https://www.instagram.com/{username}/?__a=1"

# A timeout keeps the script from hanging on a stalled connection.
response = requests.get(url, timeout=10)
if response.ok:
    # The profile fields live under graphql -> user in this payload.
    user_data = response.json()["graphql"]["user"]
    name = user_data["full_name"]
    followers = user_data["edge_followed_by"]["count"]
    following = user_data["edge_follow"]["count"]
    print(f"User: {username}\nName: {name}\nFollowers: {followers}\nFollowing: {following}")
else:
    print(f"Failed to get user data for {username}.")
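The parsing step in the example above can be separated from the network call, which makes it possible to check the logic offline. The following is a minimal sketch under that assumption: ProfileSummary and parse_profile are hypothetical names introduced here, and the sample payload is made up to mirror the keys used above rather than copied from a real response.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ProfileSummary:
    """Hypothetical container for the fields printed in the example above."""
    full_name: str
    followers: int
    following: int


def parse_profile(payload: Mapping[str, Any]) -> ProfileSummary:
    """Extract the name and counts from a payload shaped like the one above.

    Raises KeyError if an expected key is missing, for example because the
    response schema has changed.
    """
    user = payload["graphql"]["user"]
    return ProfileSummary(
        full_name=user["full_name"],
        followers=int(user["edge_followed_by"]["count"]),
        following=int(user["edge_follow"]["count"]),
    )


if __name__ == "__main__":
    # Made-up sample data, not real Instagram output.
    sample = {
        "graphql": {
            "user": {
                "full_name": "Example Account",
                "edge_followed_by": {"count": 120},
                "edge_follow": {"count": 45},
            }
        }
    }
    print(parse_profile(sample))
```

With this split, the result of response.json() can be passed straight to parse_profile, and a KeyError signals that the payload no longer has the expected shape.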

Get a user's posts

Look up a user by username and list their most recent posts.

import requests

username = "instagram"
url = f"https://www.instagram.com/{username}/?__a=1"

response = requests.get(url, timeout=10)
if response.ok:
    user_data = response.json()["graphql"]["user"]
    # Each edge wraps one post ("node") from the user's timeline.
    edges = user_data["edge_owner_to_timeline_media"]["edges"]

    for edge in edges:
        node = edge["node"]
        post_url = f"https://www.instagram.com/p/{node['shortcode']}/"
        # A post without a caption has an empty edge list, so check before indexing.
        caption_edges = node.get("edge_media_to_caption", {}).get("edges", [])
        caption = caption_edges[0].get("node", {}).get("text", "") if caption_edges else ""
        likes = node.get("edge_media_preview_like", {}).get("count", 0)
        print(f"Post URL: {post_url}\nCaption: {caption}\nLikes: {likes}\n")
else:
    print(f"Failed to get user data for {username}.")
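The loop above mixes extraction with printing. As a sketch of one alternative, the helpers below turn the same timeline structure into a list of plain dictionaries that can be sorted, filtered, or saved. The names first_caption and summarize_posts are hypothetical, and the key names are simply those used in the example above.

```python
from typing import Any, Dict, List


def first_caption(node: Dict[str, Any]) -> str:
    """Return the first caption text of a media node, or "" if there is none."""
    edges = node.get("edge_media_to_caption", {}).get("edges") or []
    if not edges:
        return ""
    return edges[0].get("node", {}).get("text", "")


def summarize_posts(user: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert the timeline edges of a user object into url/caption/likes dicts."""
    edges = user.get("edge_owner_to_timeline_media", {}).get("edges", [])
    posts = []
    for edge in edges:
        node = edge["node"]
        posts.append({
            "url": f"https://www.instagram.com/p/{node['shortcode']}/",
            "caption": first_caption(node),
            "likes": node.get("edge_media_preview_like", {}).get("count", 0),
        })
    return posts
```

A typical use would be summarize_posts(response.json()["graphql"]["user"]), followed by something like sorted(posts, key=lambda p: p["likes"], reverse=True) to rank posts by likes.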
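
Scraping Instagram

Besides the API, you can use a Python scraper to collect user data from the Instagram website. Scraping can gather large amounts of data, but it must comply with Instagram's data usage policy. The examples below show how to obtain Instagram data with a scraper.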
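Before turning to the examples, one practical way to keep a scraper's request volume modest is to enforce a minimum delay between requests. The sketch below is an illustration only: the Throttle class is a hypothetical helper, and any interval you choose is an assumption of your own, not a value taken from Instagram's policy.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        # Record the moment the caller is allowed to proceed.
        self._last = now + delay
        return delay
```

In a scraper, you would create one instance, for example throttle = Throttle(2.0), and call throttle.wait() immediately before each requests.get call.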

Get user information

Look up a user by username and read details such as full name, ID, and follower/following counts.

import json

import requests
from bs4 import BeautifulSoup

username = "instagram"
url = f"https://www.instagram.com/{username}/"

response = requests.get(url, timeout=10)
if response.ok:
    soup = BeautifulSoup(response.text, "html.parser")
    # Profile metadata is embedded as JSON-LD in a <script> tag, when present.
    script = soup.select_one("script[type='application/ld+json']")
    if script is None or not script.string:
        print(f"No JSON-LD metadata found for {username}.")
    else:
        # Key names are kept as given; the embedded structure may change.
        user_data = json.loads(script.string)["mainEntityofPage"]
        name = user_data["name"]
        followers = user_data["interactionStatistic"][0]["userInteractionCount"]
        following = user_data["interactionStatistic"][1]["userInteractionCount"]
        print(f"User: {username}\nName: {name}\nFollowers: {followers}\nFollowing: {following}")
else:
    print(f"Failed to get user data for {username}.")
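The JSON-LD extraction step above can also be done without third-party packages. The following sketch swaps BeautifulSoup for the standard library's html.parser module and works on any HTML string, so it can be tried on a saved page. LdJsonExtractor and extract_ld_json are hypothetical names introduced for this illustration.

```python
import json
from html.parser import HTMLParser
from typing import Any, List


class LdJsonExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_ld_json = False
        self._buffer: List[str] = []
        self.documents: List[Any] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ld_json = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so accumulate it.
        if self._in_ld_json:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ld_json:
            self._in_ld_json = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip blocks that are not valid JSON.


def extract_ld_json(html: str) -> List[Any]:
    """Return every JSON-LD document found in the given HTML string."""
    parser = LdJsonExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents
```

Calling extract_ld_json(response.text) returns a list, which is empty when the page carries no JSON-LD block, so the caller can handle that case instead of failing on a missing tag.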

Get a user's posts

Look up a user by username and list their most recent posts.

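import requests
from bs4 import BeautifulSoup

username = "instagram"
url = f"https://www.instagram.com/{username}/"

response = requests.get(url, timeout=10)
if response.ok:
    soup = BeautifulSoup(response.text, "html.parser")
    # These class names are generated by the site's front end and can change;
    # content rendered client-side by JavaScript will not be in the raw HTML.
    posts = soup.select("div.v1Nh3 a")

    for post in posts:
        post_url = f"https://www.instagram.com{post['href']}"
        caption_tag = post.select_one("div > span")
        likes_tag = post.select_one("div.Nm9Fw > button > span")
        # select_one returns None when nothing matches, so guard before .text.
        caption = caption_tag.text if caption_tag else ""
        likes = likes_tag.text if likes_tag else ""
        print(f"Post URL: {post_url}\nCaption: {caption}\nLikes: {likes}\n")
else:
    print(f"Failed to get user data for {username}.")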
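Unlike the API example, the scraper above yields the like count as display text rather than a number. The sketch below converts such text to an integer. The formats it accepts (thousands separators and k/m/b suffixes) are assumptions about how counts might be displayed, and parse_count is a hypothetical helper; other locales may format numbers differently.

```python
import re

# Multipliers for abbreviated counts; these suffixes are an assumption.
_SUFFIXES = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}

# A number with optional thousands separators and decimals, optionally
# followed directly by a one-letter suffix that ends a word.
_COUNT_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)([kmb]\b)?", re.IGNORECASE)


def parse_count(text: str) -> int:
    """Convert display text such as "1,234 likes" or "12.5k" to an integer.

    Raises ValueError if the text contains no number.
    """
    match = _COUNT_RE.search(text)
    if match is None:
        raise ValueError(f"no count found in {text!r}")
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    return round(number * _SUFFIXES.get(suffix, 1))
```

In the loop above, this could be applied as parse_count(likes) once likes is known to be non-empty, which makes the scraped values comparable with the numeric counts from the API example.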
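
Conclusion

Python can retrieve and process Instagram data in two ways: through the API or by scraping. Either way, comply with Instagram's data usage policy so that your use of the data stays secure and lawful. Choose the API or a scraper according to what your project needs.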