This post describes how to use a shell script to remove mistyped or invalid emails from an email list downloaded from your email subscription provider. For example, if a user types an email address with invalid characters such as a space, / or &, the address is not valid and has to be removed from the list. Duplicate addresses should be removed as well, so that the same user does not receive the same email several times. Cleaning the list this way reduces the cost of sending emails to subscribers and avoids sending to addresses that have already bounced.
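As a minimal sketch of such a character check: the function name is_clean_email is hypothetical, and the rejected characters (whitespace, comma, quotes, / and &) are an assumption based on the examples above. It is only a rough filter, not a full email validator.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Returns 0 (success) when the address is non-empty and contains none of
# the rejected characters, and 1 otherwise.
is_clean_email() {
    case $1 in
        # Empty string, or any whitespace, comma, quote, slash or ampersand
        ''|*[[:space:],\'\"/\&]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

It could be called as is_clean_email "$address" || echo "ignoring $address" for each address read from the list.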
We assume we have the following 3 files:
1. all-exported-emails.txt => contains the list of all emails, exactly as people entered them in the subscribe form.
2. unsubscribed.txt => contains the emails of people who have unsubscribed.
3. bounced.txt => contains the emails we have previously tried sending to and which bounced.
$ vim create_final_email_list.sh
#!/bin/bash
# Sort the exported list and drop duplicate addresses
sort -u all-exported-emails.txt > unique.txt
# Start with an empty output file
rm -f final.txt
while IFS= read -r line
do
    NEW_MAIL=$line
    # Accept the email only if it contains no comma, single quote or space
    if [ "$NEW_MAIL" == "${NEW_MAIL//[\,\' ]/}" ]
    then
        # -F: fixed string (no regex), -x: match the whole line, -q: no output
        if grep -Fxq -- "$NEW_MAIL" unsubscribed.txt; then
            echo "email $NEW_MAIL found in unsubscribed.txt"
        elif grep -Fxq -- "$NEW_MAIL" bounced.txt; then
            echo "email $NEW_MAIL found in bounced.txt"
        else
            echo "$NEW_MAIL" >> final.txt
        fi
    else
        echo "email $NEW_MAIL contains space, comma or quote.. hence ignoring"
    fi
done < unique.txt
$ bash create_final_email_list.sh
The above script reads the emails from all-exported-emails.txt, sorts them to remove duplicates, and then processes them one by one. An email containing a space, comma or quote is ignored. If an email is found in unsubscribed.txt or bounced.txt, it is left out of the final list; otherwise it is appended to the final.txt text file.
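The same steps can also be sketched as a single pipeline instead of a loop. This is an alternative technique, not the script above: it lets grep read the unsubscribed and bounced lists as pattern files. The function name build_final_list is hypothetical, and the sketch assumes each file holds one address per line.

```shell
#!/bin/bash
# Hypothetical function: prints the cleaned list to standard output.
# Arguments: exported list, unsubscribed list, bounced list.
build_final_list() {
    local all=$1 unsubscribed=$2 bounced=$3
    # 1. sort -u           : sort and remove duplicate addresses
    # 2. grep -v "[,' ]"   : drop lines containing a comma, quote or space
    # 3. grep -vxF -f ...  : drop lines that exactly match (-x) a fixed
    #                        string (-F) listed in either pattern file
    sort -u "$all" |
        grep -v "[,' ]" |
        grep -vxF -f "$unsubscribed" -f "$bounced"
}
```

For the three files used in this post, it would be called as build_final_list all-exported-emails.txt unsubscribed.txt bounced.txt > final.txt. Unlike the loop, it does not print a message for each skipped address.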