A way to tidy up HTML and JS code

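tidy can format HTML but leaves the JS inside it untouched. prettier can format JS; can it also format HTML with embedded JS?

Only after writing two programs did I find out that it can. I am posting the clumsy programs anyway, and besides, I may have found a bug in prettier.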

get-js.py

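from bs4 import BeautifulSoup as BS
import sys

# Parse the HTML file named on the command line.
with open(sys.argv[1], 'r') as f:
  bs = BS(f, 'html.parser')
n = 0
for t in bs.find_all('script'):
  s = t.string
  # Skip scripts with no inline body (e.g. <script src=...>).
  if s is None: continue
  # Write each inline script to 00.js, 01.js, ...
  with open(f'{n:02d}.js', 'w') as f:
    print(s, file=f, end='')
  n += 1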
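
For comparison, here is a rough sketch of the same extraction using only the standard library's html.parser, for when bs4 is not installed. The names ScriptCollector and extract_scripts are made up for this example, and unlike get-js.py it also skips script bodies that contain only whitespace.

```python
from html.parser import HTMLParser

class ScriptCollector(HTMLParser):
    """Hypothetical helper: collects the inline bodies of <script> elements."""

    def __init__(self):
        super().__init__()
        self.scripts = []   # finished script bodies, in document order
        self._buf = None    # list of text chunks while inside <script>, else None

    def handle_starttag(self, tag, attrs):
        if tag == 'script':
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == 'script' and self._buf is not None:
            body = ''.join(self._buf)
            if body.strip():            # ignore <script src=...></script>
                self.scripts.append(body)
            self._buf = None

def extract_scripts(html):
    """Return the inline script bodies found in an HTML string."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.scripts

sample = '<p>hi</p><script src="a.js"></script><script>var a = 1 < 2;</script>'
print(extract_scripts(sample))
```

Each returned string could then be written to its own NN.js file, as get-js.py does.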


rm-js.py

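from html import escape
from html.parser import HTMLParser
import sys

# Echo the HTML from stdin to stdout, dropping the contents of <script> elements.
class ScriptRemover (HTMLParser):
  T = 'script'

  def __init__(m):
    # Keep character references such as &lt; as written instead of decoding them.
    super().__init__(convert_charrefs=False); m.in_script = False

  @staticmethod
  def ta(t, a):
    # Rebuild '<tag attr="value"' without the closing '>'.
    # Valueless attributes (e.g. async) arrive as None and are printed bare.
    s = '<' + t
    if len(a): s += ' ' + ' '.join(k if v is None else f'{k}="{escape(v)}"' for k,v in a)
    return s

  def handle_starttag(m, t, a):
    print(f'{m.ta(t,a)}>', end='')
    if t.lower() == m.T: m.in_script = True

  def handle_endtag(m, t):
    print(f'</{t}>', end='')
    if t.lower() == m.T: m.in_script = False

  def handle_data(m, data):
    if not m.in_script: print(data, end='')

  def handle_entityref(m, name): print(f'&{name};', end='')

  def handle_charref(m, name): print(f'&#{name};', end='')

  def handle_startendtag(m, t, a): print(f'{m.ta(t,a)}/>', end='')

ScriptRemover().feed(sys.stdin.read())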
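
One detail to watch when rebuilding tags from HTMLParser callbacks: an attribute without a value (async, defer, ...) is reported as (name, None), and attribute values arrive already unescaped. The small sketch below shows both; AttrCollector and render_attrs are names made up for this example.

```python
from html import escape
from html.parser import HTMLParser

class AttrCollector(HTMLParser):
    """Hypothetical helper: records (tag, attrs) for every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))

def render_attrs(attrs):
    """Render attributes back to text; valueless ones stay bare, values are re-escaped."""
    return ' '.join(k if v is None else f'{k}="{escape(v)}"' for k, v in attrs)

parser = AttrCollector()
parser.feed('<script async src="a.js?x=1&amp;y=2"></script>')
tag, attrs = parser.seen[0]
print(attrs)                # the async attribute has value None
print(render_attrs(attrs))  # async src="a.js?x=1&amp;y=2"
```

Formatting the None directly with f'{k}="{v}"' would print async="None", which changes the markup.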


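With a bare JSON object placed in the JS, prettier reports a syntax error. Other tools accept the same JSON: the browser loads the JS with no error message in the console, and Python's json.load() succeeds. After changing it to the form x={"age":0}, prettier no longer complains. This may not be a prettier bug after all: at the start of a statement, JavaScript parses { as the beginning of a block rather than an object literal, so {"age":0} on its own is valid JSON but not a valid JS statement.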
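
A minimal sketch of that workaround in Python, assuming the extracted script body is nothing but one JSON value: validate it with json.loads, then wrap it in an assignment. The function name wrap_bare_json is made up here, and the variable name x just follows the form above.

```python
import json

def wrap_bare_json(text, name='x'):
    """Turn a bare JSON value into a JS assignment statement.

    Hypothetical helper: raises ValueError (json.JSONDecodeError) if the
    text is not valid JSON, otherwise returns 'name = <json>;'.
    """
    json.loads(text)                      # validation only; the result is discarded
    return f'{name} = {text.strip()};\n'

print(wrap_bare_json('{"age":0}'), end='')   # x = {"age":0};
```

The original text is kept as is rather than re-serialized, so the formatter still sees the author's own spelling of the data.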

# apt install tidy

# man tidy; -w --width

# apt install nodejs npm
# npm install -g prettier    (-g, --global: install for all users)

# prettier -h --help -c --check -w --write
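
If the formatting step is driven from Python like the two scripts above, a thin wrapper around the prettier command line might look like the sketch below. It assumes prettier is on PATH and uses only the --check and --write flags listed above; prettier_command and run_prettier are names made up for this example.

```python
import shutil
import subprocess

def prettier_command(paths, write=False):
    """Build the argument list: --write reformats in place, --check only reports."""
    flag = '--write' if write else '--check'
    return ['prettier', flag, *paths]

def run_prettier(paths, write=False):
    """Run prettier on the given files and return the CompletedProcess."""
    if shutil.which('prettier') is None:
        raise FileNotFoundError('prettier not found on PATH')
    return subprocess.run(prettier_command(paths, write),
                          capture_output=True, text=True)

print(prettier_command(['index.html'], write=True))
```

Calling run_prettier(['index.html'], write=True) would rewrite the file in place, so it is worth trying the --check form first.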

The browser does not treat &lt;script&gt; as a script tag; it displays it as the plain text <script>.

BeautifulSoup(), param features: Desirable features of the parser to be used. This may be the name of a specific parser ("lxml", "lxml-xml", "html.parser", or "html5lib") or it may be the type of markup to be used ("html", "html5", "xml"). In my test "html" did not work, but "html.parser" did.

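BeautifulSoup.find_all(self, name=None, attrs={}, recursive=True, string=None, limit=None, **kwargs). I saw someone write find_all(True) and did not know what it meant. According to the Beautiful Soup documentation, True as the name filter matches every tag.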

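I tried it with a stand-in function of the same signature. Called as a plain function, the first argument lands in self, so True is bound to name: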

def find_all(self, name=None, attrs={}, recursive=True, string=None, limit=None, **kwargs):
  print(f'{name}, {attrs} {recursive}')
find_all(1, True, recursive=False)

True, {} False

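'' == True is False, and '' == False is also False.

When a custom class overloads __eq__, using == to test for None can give the wrong answer, so use is None instead. {}, '' and 0 are all falsy.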

>>> '' is None
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
False
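
The points above about ==, is and truthiness can be checked in a few lines. AlwaysEqual is a made-up class for illustration whose __eq__ claims to be equal to everything.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ says it is equal to anything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
eq_none = (obj == None)   # goes through __eq__, so the answer is misleading
is_none = (obj is None)   # identity test, cannot be overridden

# Empty containers, empty strings and zero are falsy...
falsy = [bool(x) for x in ({}, '', 0)]
# ...but being falsy is not the same as comparing equal to False.
str_vs_bool = ('' == True, '' == False)
int_vs_bool = (0 == False, 1 == True)   # bool is a subclass of int

print(eq_none, is_none)   # True False
print(falsy)              # [False, False, False]
print(str_vs_bool)        # (False, False)
print(int_vs_bool)        # (True, True)
```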

posted @ 2025-10-11 21:02 華容道專家


