How to Scrape LinkedIn Comments (Tools, Code & Ethics)

Junaid Khalid
You're launching a product, analyzing competitors, or researching your target audience's pain points. LinkedIn comments are a goldmine of unfiltered feedback, questions, and objections.
But manually copying hundreds of comments is tedious. You need the data systematically extracted, organized, and ready for analysis.
This is where LinkedIn comment scraping comes in. But before you start extracting data, you need to understand three critical dimensions: what's technically possible, what's legally permissible, and what's ethically sound.
This guide covers all three. You'll learn the tools, the code, and the boundaries. By the end, you'll know how to extract LinkedIn comment data responsibly (or why you might choose alternative approaches entirely).
What is LinkedIn Comment Scraping?
Comment scraping means programmatically extracting comments from LinkedIn posts and converting them into structured data you can analyze.
Why scrape LinkedIn comments?
Research and competitive analysis: Understand what questions prospects ask on competitor posts. Identify unmet needs. Discover language patterns your target audience uses.
Example: A SaaS founder scrapes comments on competitors' feature announcement posts to discover what users are requesting, complaining about, or praising.
Sentiment analysis on engagement: Track sentiment trends across your industry. Are people excited about new regulations or concerned? Which topics generate positive vs. negative reactions?
Example: A marketing agency scrapes comments on industry news posts to gauge professional sentiment about remote work policies, informing their content strategy.
Lead generation and prospecting: Identify engaged commenters on strategic posts. Someone who writes thoughtful comments on industry content is an active LinkedIn user likely to respond to connection requests.
Example: A business consultant scrapes comments from top leadership coaches' posts, then reaches out to engaged commenters with personalized connection requests.
Influencer identification: Find active participants in your niche. Who comments frequently? Whose comments get the most replies? These are micro-influencers worth building relationships with.
Content strategy insights: Analyze which comment types get the most engagement. Do questions outperform statements? Do people prefer data-driven comments or personal stories?
Is Scraping LinkedIn Comments Legal?
The legality of web scraping sits in a gray area. The answer depends on how you scrape, what you scrape, and what you do with the data.
LinkedIn's Terms of Service on scraping: LinkedIn's User Agreement (Section 8.2) explicitly prohibits scraping:
"You agree that you will not... use bots or other automated methods to access the Services, add or download contacts, send or redirect messages, or perform other activities through the Services."
This is unambiguous. LinkedIn's official position is: don't scrape our platform.
Notable legal cases (hiQ Labs vs. LinkedIn):
The most important scraping precedent came from hiQ Labs v. LinkedIn (2017-2022). Key developments:
- hiQ's position: They argued that scraping publicly visible data (including comments) should be legal, as it's already publicly accessible
- LinkedIn's position: Even public data is protected by the Computer Fraud and Abuse Act (CFAA) when accessed via scraping
- Initial ruling (2019): Ninth Circuit sided with hiQ, stating that scraping publicly accessible data doesn't violate CFAA
- Final outcome (2022): After remand, the district court found hiQ had breached LinkedIn's User Agreement, and the parties settled, leaving the broader legal question only partially resolved
Current legal consensus (as of 2025):
- Scraping public data is in a legal gray area, not clearly illegal but not clearly legal
- Scraping data behind login walls is riskier legally
- The issue is more about LinkedIn's Terms of Service (a contract violation) than criminal law
- LinkedIn actively pursues legal action against commercial scraping operations
Ethical scraping practices:
Even if something is technically legal or in a gray area, ethical considerations matter:
- Respect user privacy: Comments are public, but users didn't consent to mass data extraction. Avoid exposing personally identifiable information (PII) in research outputs.
- Don't sell or misuse scraped data: Extracting comments for market research differs from selling email lists scraped from profiles.
- Respect rate limits: Excessive scraping burdens LinkedIn's servers. Implement delays between requests.
- Consider alternatives first: Does the LinkedIn API provide what you need? Can you collect data manually for smaller projects? Scraping should be a last resort, not the first approach.
- Be transparent: If publishing research based on scraped LinkedIn data, disclose your methodology.
When to use the official LinkedIn API instead:
LinkedIn provides APIs for legitimate business use cases:
- LinkedIn Marketing API: For advertising and sponsored content analytics
- LinkedIn Consumer API: For authorized integrations (limited access)
- LinkedIn Sales Navigator: For lead generation within their ecosystem
API limitations for comment data:
- Most LinkedIn APIs don't provide broad comment access
- Comment data access requires special partnerships
- Rate limits are strict
- Approval process is lengthy and selective
For most use cases (competitive research, sentiment analysis), LinkedIn won't grant API access. This forces the choice between manual collection and scraping.
Our recommendation: For commercial projects or ongoing data collection, the legal and reputational risks of scraping may outweigh benefits. For one-time research projects with small datasets, scraping public comments carries lower risk but isn't risk-free.
Methods to Scrape LinkedIn Comments
Four approaches exist, ranging from completely manual to fully automated.
Method 1: Manual Copy-Paste (Small Scale)
What it involves: Manually copying comments from posts and pasting them into a spreadsheet.
When to use it:
- Analyzing 50-100 comments from a handful of posts
- One-time research projects
- When you want complete safety from ToS violations
- Academic research requiring careful context preservation
Pros:
- Zero technical skills required
- No scraping software needed
- Complete control over what data you collect
- No risk of account restrictions
Cons:
- Extremely time-consuming
- Doesn't scale beyond small projects
- Manual errors likely
- Difficult to maintain consistency
Process:
- Open target LinkedIn post
- Scroll to load all comments (click "Load more comments")
- Copy each comment into spreadsheet columns: commenter name, comment text, timestamp, likes
- Repeat for each post
Time estimate: 3-5 minutes per post (10-15 comments per post)
Method 2: Browser Extensions & Tools
What it involves: Using third-party browser extensions or desktop applications designed for LinkedIn scraping.
Popular tools:
- Phantombuster (cloud-based automation)
- Octoparse (no-code scraping)
- Browser extensions like "LinkedIn Comment Extractor"
When to use it:
- Moderate volume (100-500 comments)
- Non-technical users
- When you need occasional scraping capability
- Situations where paying for a tool is acceptable
Pros:
- No coding required
- Point-and-click interface
- Handles authentication
- Often includes CSV export
Cons:
- Costs $50-150/month for robust tools
- Still violates LinkedIn ToS
- Limited customization
- May stop working if LinkedIn changes their site structure
Method 3: Python + Selenium/BeautifulSoup
What it involves: Writing custom Python scripts using libraries like Selenium (browser automation) or BeautifulSoup (HTML parsing).
When to use it:
- Large-scale projects (1,000+ comments)
- Need for customization
- Technical users comfortable with code
- Ongoing data collection needs
Pros:
- Complete control over scraping logic
- Free (just your time investment)
- Can integrate with data analysis pipelines
- Customizable for specific use cases
Cons:
- Requires programming knowledge
- Time-consuming to set up initially
- Breaks when LinkedIn changes their HTML structure
- Still violates LinkedIn ToS
- Risk of IP bans if not careful
We'll cover this method in detail in the next section.
Method 4: Third-Party Scraping Services
What it involves: Hiring companies that provide "LinkedIn data as a service" or freelancers who scrape on your behalf.
When to use it:
- You need large datasets but lack technical skills
- One-time project where outsourcing makes sense
- Want to distance your personal LinkedIn account from scraping
Pros:
- No technical work required
- Vendors handle infrastructure
- Typically deliver cleaned, structured data
Cons:
- Expensive ($500-5,000+ depending on scale)
- Quality varies significantly
- You're still responsible for how you use the data
- Legal liability unclear
- No guarantee of data freshness or accuracy
Comparison table:
| Method | Cost | Skill Level | Scale | Risk Level | Time Investment |
|---|---|---|---|---|---|
| Manual Copy-Paste | Free | None | 50-100 comments | Very Low | Very High |
| Browser Tools | $50-150/mo | Low | 100-500 comments | Medium | Low |
| Python Scripts | Free | High | 1,000+ comments | Medium-High | High (setup), Low (ongoing) |
| Scraping Services | $500-5,000 | None | Unlimited | High | Very Low |
How to Scrape LinkedIn Comments with Python (Tutorial)
This section provides a technical walkthrough for developers comfortable with Python.
Important disclaimer: This tutorial is for educational purposes. Scraping LinkedIn violates their Terms of Service. Use this knowledge responsibly and understand the risks.
Prerequisites: Installing Selenium & Chrome Driver
What you'll need:
- Python 3.8 or higher
- Chrome browser
- Basic familiarity with Python and command line
Installation steps:
# Install required Python packages
pip install selenium beautifulsoup4 pandas

# Install webdriver-manager for automatic ChromeDriver management
pip install webdriver-manager

# Optional: packages used in the analysis section later in this guide
pip install textblob scikit-learn
Why Selenium? LinkedIn's content is dynamically loaded via JavaScript. A plain HTTP request returns the page before that JavaScript runs, so BeautifulSoup alone never sees the comments. Selenium automates a real browser, giving you access to the fully rendered content.
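If you want the browser session to look more like a normal user's, you can pass a few Chrome options when creating the driver. This is an optional sketch, not part of the minimal setup below; the flags shown are standard Chrome options, and whether they reduce detection is an assumption:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()
options.add_argument('--window-size=1280,900')   # Realistic desktop viewport
options.add_argument('--disable-notifications')  # Suppress permission popups
# Avoid headless mode: LinkedIn is more likely to challenge headless browsers

driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=options
)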
Step 1: Authenticate with LinkedIn
Selenium needs to authenticate as you to access LinkedIn content.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import time
# Set up Chrome driver
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
# Navigate to LinkedIn login
driver.get('https://www.linkedin.com/login')
# Wait for manual login
input("Log in to LinkedIn in the browser window, then press Enter here...")
print("Logged in successfully!")
Important: This script requires you to manually log in. Storing credentials in code is insecure and may trigger LinkedIn's security alerts.
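One way to avoid logging in on every run is to point Chrome at a persistent user profile directory so the LinkedIn session cookie survives between sessions. A minimal sketch, assuming a local profile path of your choosing (./selenium-profile here is a hypothetical location):

import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()
# Reuse a dedicated Chrome profile so cookies persist between runs
profile_dir = os.path.abspath('./selenium-profile')  # hypothetical path
options.add_argument(f'--user-data-dir={profile_dir}')

driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=options
)
driver.get('https://www.linkedin.com/feed/')
# If the feed loads without redirecting to /login, the saved session is still valid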
Step 2: Navigate to Target Post
# URL of the LinkedIn post you want to scrape comments from
post_url = "https://www.linkedin.com/posts/username_activity-123456789"
driver.get(post_url)
time.sleep(3) # Wait for page to load
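A fixed time.sleep(3) is fragile on slow connections. A more robust option is Selenium's explicit wait, which polls until the comments section actually renders. The class name below is an assumption based on LinkedIn's current markup and may need updating:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 15 seconds for the comments section to appear
WebDriverWait(driver, 15).until(
    EC.presence_of_element_located((By.CLASS_NAME, 'comments-comments-list'))
)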
Step 3: Scroll to Load All Comments
LinkedIn uses "lazy loading" - comments appear as you scroll. You need to scroll repeatedly to load all comments.
from selenium.webdriver.common.keys import Keys

def scroll_to_load_comments(driver, max_scrolls=20):
    """Scroll down repeatedly to load all comments"""
    body = driver.find_element(By.TAG_NAME, 'body')
    for i in range(max_scrolls):
        # Scroll down
        body.send_keys(Keys.PAGE_DOWN)
        time.sleep(2)  # Wait for content to load

        # Try to click "Show more comments" button if it exists
        try:
            show_more_button = driver.find_element(
                By.XPATH,
                "//button[contains(@class, 'comments-comments-list__show-all-link')]"
            )
            show_more_button.click()
            time.sleep(2)
        except Exception:
            pass  # Button doesn't exist or already clicked

    print(f"Finished scrolling after {max_scrolls} scrolls")

# Execute scrolling
scroll_to_load_comments(driver)
Note: max_scrolls=20 limits scrolling to prevent infinite loops. Adjust based on expected comment volume.
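If you don't know the comment volume in advance, one alternative is to stop when the number of loaded comments stops growing rather than after a fixed scroll count. A sketch reusing the imports from Step 3 and the comments-comment-item class from the extraction step (which, like every selector here, may change):

def scroll_until_stable(driver, max_scrolls=50, stable_rounds=3):
    """Scroll until the loaded comment count stops increasing."""
    body = driver.find_element(By.TAG_NAME, 'body')
    last_count, unchanged = 0, 0
    for _ in range(max_scrolls):
        body.send_keys(Keys.PAGE_DOWN)
        time.sleep(2)
        count = len(driver.find_elements(By.CSS_SELECTOR, 'article.comments-comment-item'))
        if count == last_count:
            unchanged += 1
            if unchanged >= stable_rounds:  # No new comments for several rounds
                break
        else:
            last_count, unchanged = count, 0
    print(f"Loaded {last_count} comments")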
Step 4: Extract Comment Text, Author, Timestamp
from bs4 import BeautifulSoup

def extract_comments(driver):
    """Extract all comment data from the loaded page"""
    # Get page HTML
    html = driver.page_source
    soup = BeautifulSoup(html, 'html.parser')

    # Find all comment containers
    # Note: LinkedIn's HTML structure changes frequently
    # These selectors may need updating
    comment_elements = soup.find_all('article', class_='comments-comment-item')

    comments_data = []
    for comment in comment_elements:
        try:
            # Extract author name
            author = comment.find('span', class_='comments-post-meta__name-text').get_text(strip=True)

            # Extract comment text
            comment_text = comment.find('span', class_='comments-comment-item__main-content').get_text(strip=True)

            # Extract timestamp
            timestamp = comment.find('time', class_='comments-comment-item__timestamp')
            timestamp_text = timestamp.get_text(strip=True) if timestamp else "N/A"

            # Extract like count (if available)
            like_element = comment.find('button', {'aria-label': lambda x: x and 'Like' in x})
            likes = 0
            if like_element:
                like_text = like_element.get_text(strip=True)
                # Parse like count from text like "5 Likes"
                likes = int(''.join(filter(str.isdigit, like_text))) if any(char.isdigit() for char in like_text) else 0

            comments_data.append({
                'author': author,
                'comment': comment_text,
                'timestamp': timestamp_text,
                'likes': likes
            })
        except Exception as e:
            print(f"Error parsing comment: {e}")
            continue

    return comments_data

# Extract all comments
comments = extract_comments(driver)
print(f"Extracted {len(comments)} comments")
Important: LinkedIn frequently changes their HTML class names and structure to discourage scraping. This code will likely need adjustments when LinkedIn updates their interface.
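One way to soften that breakage is to try several candidate selectors in order and take the first that matches. The alternate class name below is a hypothetical placeholder to illustrate the pattern, not a verified LinkedIn class:

def find_first_text(parent, candidates):
    """Return text from the first (tag, class) pair that matches."""
    for tag, css_class in candidates:
        element = parent.find(tag, class_=css_class)
        if element:
            return element.get_text(strip=True)
    return None

# First entry is the selector used above; second is a hypothetical fallback
author = find_first_text(comment, [
    ('span', 'comments-post-meta__name-text'),
    ('span', 'comments-comment-meta__description-title'),  # hypothetical alternate
]) or "Unknown"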
Step 5: Export to CSV/JSON
import pandas as pd
import json

def export_comments(comments_data, format='csv'):
    """Export comments to CSV or JSON"""
    if format == 'csv':
        df = pd.DataFrame(comments_data)
        df.to_csv('linkedin_comments.csv', index=False)
        print("Comments exported to linkedin_comments.csv")
    elif format == 'json':
        with open('linkedin_comments.json', 'w', encoding='utf-8') as f:
            json.dump(comments_data, f, indent=2, ensure_ascii=False)
        print("Comments exported to linkedin_comments.json")

# Export in both formats
export_comments(comments, format='csv')
export_comments(comments, format='json')

# Close the browser
driver.quit()
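If you scrape comments from several posts over time, appending each batch to one file and de-duplicating keeps the dataset clean. A small sketch, under the assumption that the (author, comment) pair is unique enough to identify repeats:

import os
import pandas as pd

def append_comments(comments_data, path='linkedin_comments.csv'):
    """Append a new batch of comments to a CSV, dropping duplicates."""
    new_df = pd.DataFrame(comments_data)
    if os.path.exists(path):
        combined = pd.concat([pd.read_csv(path), new_df], ignore_index=True)
    else:
        combined = new_df
    # Assumes author plus comment text identifies a unique comment
    combined = combined.drop_duplicates(subset=['author', 'comment'])
    combined.to_csv(path, index=False)
    print(f"{len(combined)} unique comments saved to {path}")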
Full Python Code Example
Here's the complete script:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
import pandas as pd
import time

def setup_driver():
    """Initialize Chrome driver"""
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    return driver

def login_to_linkedin(driver):
    """Navigate to LinkedIn and wait for manual login"""
    driver.get('https://www.linkedin.com/login')
    input("Log in to LinkedIn, then press Enter...")

def scroll_to_load_comments(driver, max_scrolls=20):
    """Scroll to load all comments"""
    body = driver.find_element(By.TAG_NAME, 'body')
    for i in range(max_scrolls):
        body.send_keys(Keys.PAGE_DOWN)
        time.sleep(2)
        try:
            show_more = driver.find_element(
                By.XPATH,
                "//button[contains(text(), 'more comment')]"
            )
            show_more.click()
            time.sleep(2)
        except Exception:
            pass  # No "more comments" button visible

def extract_comments(driver):
    """Extract comment data"""
    html = driver.page_source
    soup = BeautifulSoup(html, 'html.parser')
    comments = []
    comment_elements = soup.find_all('article', class_='comments-comment-item')
    for comment in comment_elements:
        try:
            author = comment.find('span', class_='comments-post-meta__name-text').get_text(strip=True)
            text = comment.find('span', class_='comments-comment-item__main-content').get_text(strip=True)
            timestamp = comment.find('time')
            timestamp_text = timestamp.get_text(strip=True) if timestamp else "N/A"
            # Parse like counts (as in Step 4) so the analysis section's
            # 'likes' column always exists in the exported CSV
            like_element = comment.find('button', {'aria-label': lambda x: x and 'Like' in x})
            likes = 0
            if like_element:
                like_text = like_element.get_text(strip=True)
                likes = int(''.join(filter(str.isdigit, like_text))) if any(char.isdigit() for char in like_text) else 0
            comments.append({
                'author': author,
                'comment': text,
                'timestamp': timestamp_text,
                'likes': likes
            })
        except Exception:
            continue  # Skip comments that don't match the expected structure
    return comments

def main():
    """Main scraping workflow"""
    # Setup
    driver = setup_driver()
    login_to_linkedin(driver)

    # Navigate to post
    post_url = input("Enter LinkedIn post URL: ")
    driver.get(post_url)
    time.sleep(3)

    # Load and extract comments
    print("Loading comments...")
    scroll_to_load_comments(driver)
    print("Extracting comments...")
    comments = extract_comments(driver)

    # Export
    df = pd.DataFrame(comments)
    df.to_csv('linkedin_comments.csv', index=False)
    print(f"Exported {len(comments)} comments to linkedin_comments.csv")

    driver.quit()

if __name__ == "__main__":
    main()
To run this script:
- Save as scrape_linkedin_comments.py
- Run: python scrape_linkedin_comments.py
- Log in when prompted
- Enter the LinkedIn post URL
- Wait for extraction to complete
Analyzing Scraped LinkedIn Comment Data
Once you have comment data, here's how to extract insights.
Sentiment Analysis with NLP Tools
from textblob import TextBlob
import pandas as pd

def analyze_sentiment(comment_text):
    """Analyze sentiment of a comment"""
    blob = TextBlob(comment_text)
    polarity = blob.sentiment.polarity  # -1 (negative) to 1 (positive)
    if polarity > 0.1:
        return 'Positive'
    elif polarity < -0.1:
        return 'Negative'
    else:
        return 'Neutral'

# Load your scraped comments
df = pd.read_csv('linkedin_comments.csv')

# Add sentiment column
df['sentiment'] = df['comment'].apply(analyze_sentiment)

# Analyze distribution
print(df['sentiment'].value_counts())
Example output:
Positive 45
Neutral 32
Negative 8
This reveals overall sentiment on a post or across multiple posts.
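Beyond the three buckets, keeping the raw polarity score supports finer comparisons, such as average sentiment per post or over time. A short extension building on the same DataFrame:

# Keep the raw polarity score alongside the label
df['polarity'] = df['comment'].apply(lambda text: TextBlob(text).sentiment.polarity)

# Overall tone: mean polarity across all comments
print(f"Mean polarity: {df['polarity'].mean():.2f}")

# Most positive and most negative comments
print(df.nlargest(3, 'polarity')[['author', 'comment', 'polarity']])
print(df.nsmallest(3, 'polarity')[['author', 'comment', 'polarity']])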
Identifying Top Commenters & Influencers
# Find most active commenters
top_commenters = df['author'].value_counts().head(10)
print("Most Active Commenters:")
print(top_commenters)
# Find comments with most engagement (likes)
top_comments = df.nlargest(10, 'likes')[['author', 'comment', 'likes']]
print("\nMost Liked Comments:")
print(top_comments)
This identifies micro-influencers worth connecting with or partnering with.
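Comment frequency and likes measure different things, so one option is to combine them into a rough per-author engagement score. The weighting below is an arbitrary assumption worth tuning to your use case:

# Aggregate per author: how often they comment and how many likes they earn
author_stats = df.groupby('author').agg(
    comment_count=('comment', 'count'),
    total_likes=('likes', 'sum'),
)

# Hypothetical weighting: a like counts double relative to raw frequency
author_stats['engagement_score'] = (
    author_stats['comment_count'] + 2 * author_stats['total_likes']
)
print(author_stats.sort_values('engagement_score', ascending=False).head(10))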
Engagement Pattern Analysis
# Analyze comment length vs engagement
df['word_count'] = df['comment'].apply(lambda x: len(x.split()))
# Correlation between length and likes
correlation = df[['word_count', 'likes']].corr()
print("Correlation between comment length and likes:")
print(correlation)
# Average engagement by comment length bracket
df['length_category'] = pd.cut(df['word_count'], bins=[0, 20, 50, 100, 500], labels=['Short', 'Medium', 'Long', 'Very Long'])
engagement_by_length = df.groupby('length_category')['likes'].mean()
print("\nAverage likes by comment length:")
print(engagement_by_length)
This reveals whether your audience prefers concise comments or detailed responses.
Keyword & Topic Clustering
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
# Extract key topics using TF-IDF
vectorizer = TfidfVectorizer(max_features=20, stop_words='english')
tfidf_matrix = vectorizer.fit_transform(df['comment'])
# Get top keywords
feature_names = vectorizer.get_feature_names_out()
print("Top Keywords in Comments:")
print(feature_names)
# Cluster comments into topics
kmeans = KMeans(n_clusters=5, random_state=42)
df['topic_cluster'] = kmeans.fit_predict(tfidf_matrix)
# Analyze clusters
for cluster_id in range(5):
    cluster_comments = df[df['topic_cluster'] == cluster_id]
    print(f"\nCluster {cluster_id} ({len(cluster_comments)} comments):")
    print(cluster_comments['comment'].head(3))
This automatically groups comments by theme, revealing what topics dominate the conversation.
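Cluster IDs alone don't tell you what each theme is about. Inspecting the highest-weighted TF-IDF terms in each cluster centroid gives every cluster a rough label. A sketch using the fitted objects from above:

import numpy as np

# For each cluster, list the terms with the highest centroid weights
order = np.argsort(kmeans.cluster_centers_, axis=1)[:, ::-1]
for cluster_id in range(5):
    top_terms = [feature_names[i] for i in order[cluster_id, :5]]
    print(f"Cluster {cluster_id}: {', '.join(top_terms)}")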
Alternatives to Scraping: LinkedIn API
Before scraping, explore official alternatives.
Official LinkedIn API capabilities:
- LinkedIn Marketing API: Access campaign performance, sponsored content analytics
- LinkedIn Consumer API: Limited access for member profile data (with user authorization)
- LinkedIn Lead Gen Forms API: Access leads from LinkedIn ad campaigns
Limitations for comment data:
- No public API for extracting comments from posts
- Comment access requires special partnership agreements
- Most API access is restricted to LinkedIn's advertising customers
When API access is required:
- Building a product that integrates with LinkedIn
- Running ongoing data collection for commercial purposes
- Need for reliable, long-term data access
How to apply for API access:
- Create a LinkedIn App via LinkedIn Developers
- Submit detailed use case explanation
- Wait for approval (can take weeks to months)
- Most applications are rejected unless you're an established company with clear value proposition
Reality: For comment analysis, LinkedIn won't grant API access in most cases. This forces a choice among manual collection, scraping (with its risks), and alternative approaches.
Best Practices & Safety Tips
If you proceed with scraping despite the risks, follow these guidelines to minimize issues.
Use rate limiting to avoid detection:
import random
import time

def polite_scrape(urls):
    """Scrape with delays to avoid detection"""
    for url in urls:
        # Random delay between 3-7 seconds
        delay = random.uniform(3, 7)
        time.sleep(delay)
        # Your scraping logic here
        scrape_post(url)
LinkedIn monitors request frequency. Human-like delays (3-10 seconds between actions) reduce detection risk.
Respect user privacy & GDPR compliance:
- Don't collect personally identifiable information (PII) beyond what's publicly visible
- If scraping from EU users, understand GDPR implications
- Don't sell or share scraped data containing personal information
- Anonymize data when possible for research outputs (see the sketch below)
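On that last point, a simple option is to replace author names with stable pseudonyms before exporting or sharing data. A minimal sketch using Python's standard library; the salt is a value you'd choose and keep private:

import hashlib

SALT = "replace-with-a-private-random-string"  # keep this out of shared outputs

def pseudonymize(name):
    """Map an author name to a stable, non-reversible pseudonym."""
    digest = hashlib.sha256((SALT + name).encode('utf-8')).hexdigest()
    return f"user_{digest[:10]}"

df['author'] = df['author'].apply(pseudonymize)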
Don't sell or misuse scraped data:
- Using scraped comments for market research (internal use) carries different ethical weight than selling scraped data
- Spamming commenters with cold outreach is both unethical and ineffective
Consider using official APIs when possible:
- Always check if an official API meets your needs before scraping
- APIs provide reliable, legal data access
- Scraping should be a last resort, not the default approach
Warning signs you're scraping too aggressively:
- LinkedIn sends security alerts
- Your account gets temporarily restricted
- Comments stop loading (rate limited)
- LinkedIn adds CAPTCHA challenges
If you see any of these, stop immediately and wait 24-48 hours before resuming at lower volume.
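If you automate multi-post collection, building that stop-and-wait behavior into the script is safer than reacting manually. A sketch of exponential backoff with jitter; the base delay and cap are assumptions, not LinkedIn-documented limits:

import random
import time

def backoff_delay(attempt, base=5, cap=600):
    """Exponential backoff with jitter: ~5s, 10s, 20s, ... capped at 10 minutes."""
    delay = min(cap, base * (2 ** attempt))
    return delay * random.uniform(0.5, 1.5)

for attempt in range(5):
    try:
        scrape_post(post_url)  # your scraping logic from earlier
        break
    except Exception:
        wait = backoff_delay(attempt)
        print(f"Request failed; waiting {wait:.0f}s before retrying")
        time.sleep(wait)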
Should You Scrape LinkedIn Comments? (Our Verdict)
When scraping makes sense:
- One-time academic research projects with small datasets
- Personal learning and experimentation
- Competitive research where alternative data sources don't exist
- Situations where you're willing to accept potential account restrictions
When to avoid scraping:
- Commercial products or services depending on scraped data
- Ongoing, large-scale data collection
- Situations where your LinkedIn account is valuable and account restrictions would be costly
- Any use case where legal liability is a concern
Better alternatives to consider:
1. Manual collection for small projects: If you need data from 5-10 posts, manual copy-paste is safer and nearly as fast as setting up a script.
2. Survey your audience instead: Want to understand pain points? Ask directly via LinkedIn polls or surveys rather than inferring from scraped comments.
3. Use LigoAI for engagement instead of extraction: If your goal is understanding audience sentiment to improve your engagement strategy, using LigoAI to engage thoughtfully generates similar insights through direct interaction - without the legal and ethical risks of scraping.
4. LinkedIn Sales Navigator: For lead generation, Sales Navigator provides filtered prospect lists legally and reliably.
5. Social listening tools: Platforms like Brandwatch or Sprout Social offer LinkedIn monitoring capabilities through official partnerships.
The reality of scraping in 2025: LinkedIn's anti-scraping measures are more sophisticated than ever. Detection algorithms identify bot-like behavior. Legal precedents remain unclear. The risks have increased while alternatives have improved.
For most professionals, the risk-reward calculation doesn't favor scraping. The exceptions are researchers with one-time projects and technical users willing to accept potential account consequences.
Start Engaging Strategically Instead
If your goal is understanding LinkedIn conversations to improve your own engagement, there's a safer approach: participate actively using AI assistance.
Why engagement beats extraction:
- You generate insights while building relationships
- Zero legal or ethical concerns
- LinkedIn rewards active participants with increased visibility
- More sustainable long-term strategy
Using LigoAI to understand your audience:
Instead of scraping comments to analyze sentiment, use LigoAI to engage with posts in your niche. As you engage, you'll naturally understand:
- What topics resonate with your audience
- Which questions come up repeatedly
- What language and tone works best
- Who the active participants are
You gather the same insights through participation rather than observation. And you build your professional brand in the process.
The bottom line: Scraping is technically possible but legally risky and ethically questionable. For most use cases, better alternatives exist that don't jeopardize your LinkedIn account or create legal exposure.
Choose the approach that aligns with your values and risk tolerance. When in doubt, engage rather than extract.
Related Resources
- Automate LinkedIn Comments: 5 Tools to Save Time in 2025
- LinkedIn Comment Impressions: The Complete Guide for Agency Owners and Consultants (2025)
- How to Write a Good LinkedIn Comment: 7-Step Formula for Agency Owners
- 5 Ways to Use AI Comments on LinkedIn Without Sounding Like a Robot
- LinkedIn Comment Examples: Professional Responses for Every Situation

About the Author
Junaid Khalid
I have helped 50,000+ professionals build a personal brand on LinkedIn through my content and products, and I have directly consulted dozens of businesses on building a Founder Brand and Employee Advocacy Program to grow their business via LinkedIn.