How to Cache in HAProxy (1)

Last modified: 2023-12-03 15:08:37.538000. Author: Mango

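How to Cache in HAProxy

HAProxy is an open-source, high-performance load balancer that forwards HTTP and TCP requests to backend servers. Under high concurrency it can also act as a cache in front of those servers, storing static resources or frequently requested dynamic resources to reduce the load on the backends.

This article shows how to implement caching in HAProxy, covering cache strategies based on the URL and on request headers, as well as cache expiration and control. It also looks at how to monitor and manage cached data and how to choose among several caching approaches.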

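Required software and libraries

HAProxy built with Lua support (Lua scripting first appeared in HAProxy 1.6, not 1.5; a recent 2.x release is recommended)
Lua library (Lua 5.3 or later)
Lua compiler (luac), useful for syntax-checking scripts before HAProxy loads them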

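HAProxy caching basics

The caching mechanism described in this article is built on Lua scripts that inspect and modify request and response headers before the request is forwarded and after the response comes back. HAProxy must be built with Lua support. When HAProxy reads its configuration, the lua-load directive loads and runs the script, which registers actions; those actions are then called from http-request rules (request phase) and http-response rules (response phase). HAProxy 1.8 and later also ship a built-in cache configured through a cache section; this article takes the Lua route instead.

A Lua-based cache in HAProxy can provide the following features:

Cache expiration: entries are discarded once they exceed a configured lifetime.
Cache control: caching governed by the Cache-Control or Pragma headers sent by the client or the server.
Vary caching: caching governed by the Vary header.
URL-based caching: responses are cached per request URL.
Request-header-based caching: the cache strategy is chosen from information in the request headers.
Response-header-based caching: the cache strategy is chosen from information in the response headers.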
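
The cache-control feature in the list above can be illustrated in plain Lua, outside HAProxy. The sketch below is only one possible reading of it: the function names parse_cache_control and cache_lifetime are hypothetical, and the rule set (refuse no-store, no-cache and private; prefer s-maxage over max-age; fall back to a default lifetime) is an assumption rather than what any particular HAProxy version does.

```lua
-- Parse a Cache-Control header value into a table of directives.
-- Names are lower-cased; directives without a value map to true.
function parse_cache_control(value)
    local directives = {}
    for token in (value or ""):gmatch("[^,]+") do
        local item = token:match("^%s*(.-)%s*$")            -- trim whitespace
        local name, arg = item:match("^([^=%s]+)%s*=%s*(.*)$")
        if name then
            directives[name:lower()] = (arg:gsub('"', ""))   -- strip quotes
        elseif item ~= "" then
            directives[item:lower()] = true
        end
    end
    return directives
end

-- Return the lifetime in seconds, or nil when the response must not be cached.
function cache_lifetime(cache_control, pragma, default_ttl)
    local d = parse_cache_control(cache_control)
    if d["no-store"] or d["no-cache"] or d["private"] then
        return nil
    end
    if pragma and pragma:lower():find("no-cache", 1, true) then
        return nil
    end
    local max_age = tonumber(d["s-maxage"] or d["max-age"])
    if max_age then
        if max_age <= 0 then
            return nil
        end
        return max_age
    end
    return default_ttl
end
```

A caller would pass the header values it read from the response, for example cache_lifetime("public, max-age=60", nil, 3600), and skip storing the response whenever the result is nil.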

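URL-based caching

URL-based caching is the simplest strategy: the response is cached purely by the URL of the request. For later requests to the same URL, HAProxy can return the cached response directly instead of forwarding the request to a backend server.

For example, the following Lua script implements a URL-based cache: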

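-- cache.lua: URL-based cache, loaded with "lua-load" in the global section.
-- Assumes a recent HAProxy (2.4 or later) for txn:reply, res.body and wait-for-body.
local cache = {}      -- key -> { body, content_type, t }
local TTL = 3600      -- entry lifetime in seconds

-- Build the cache key from the Host header and the request path.
local function cache_key(txn)
    return txn.sf:req_hdr('host') .. txn.sf:path()
end

-- Return the entry stored under key, or nil if it is missing or expired.
local function cache_lookup(key)
    local entry = cache[key]
    if entry == nil or (os.time() - entry.t) > TTL then
        cache[key] = nil
        return nil
    end
    return entry
end

-- Store a response body and its Content-Type under key.
local function update_cache(key, body, content_type)
    cache[key] = { body = body, content_type = content_type, t = os.time() }
end

core.register_action('cache_request', {'http-req'}, function(txn)
    local key = cache_key(txn)
    -- Save the key so the response action does not have to rebuild it.
    txn:set_var('txn.cache_key', key)
    local entry = cache_lookup(key)
    if entry ~= nil then
        -- Cache hit: answer directly and stop processing the request.
        txn:done(txn:reply({
            status = 200,
            headers = { ['content-type'] = { entry.content_type } },
            body = entry.body
        }))
    end
end)

core.register_action('cache_response', {'http-res'}, function(txn)
    local key = txn:get_var('txn.cache_key')
    -- Only successful responses are stored.
    if key ~= nil and txn.f:status() == 200 then
        update_cache(key, txn.sf:res_body(), txn.sf:res_hdr('content-type'))
    end
end)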

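The Lua script above defines an action named cache_request that checks whether the cache holds an entry for the URL of the current request and, if so, returns the cached response directly. If no entry exists or it has expired, the request is passed through to the backend, and the cache_response action records the response in the cache.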
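
The lookup-and-expire logic can be tried outside HAProxy with plain Lua. The following is a minimal sketch, assuming an in-memory table and an injectable clock so that expiry can be checked without waiting; new_cache, put and get are hypothetical names, not part of the HAProxy Lua API.

```lua
-- Minimal TTL cache: each entry records when it was stored and is
-- ignored (and dropped) once it is older than ttl seconds.
function new_cache(ttl, clock)
    local self = { ttl = ttl, clock = clock or os.time, entries = {} }

    function self.put(key, response)
        self.entries[key] = { response = response, stored_at = self.clock() }
    end

    function self.get(key)
        local entry = self.entries[key]
        if entry == nil then
            return nil
        end
        if self.clock() - entry.stored_at > self.ttl then
            self.entries[key] = nil   -- expired: remove it so the table does not grow forever
            return nil
        end
        return entry.response
    end

    return self
end

-- Usage with a fake clock.
local now = 1000
local cache = new_cache(3600, function() return now end)
cache.put("example.com/index.html", "<html>hello</html>")
print(cache.get("example.com/index.html"))  -- hit: prints the stored body
now = now + 3601
print(cache.get("example.com/index.html"))  -- miss: the entry has expired
```

Passing the clock as a parameter keeps the expiry rule testable; in production the default os.time is used.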

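In the HAProxy configuration file, load the script in the global section, then call cache_request from the frontend and cache_response from the backend:

global
    # Example path; point this at wherever the Lua script is saved.
    lua-load /etc/haproxy/cache.lua

frontend http-in
    bind *:80
    mode http
    timeout client 1m
    option httplog
    option dontlognull
    option http-server-close
    default_backend http-out

    # Answer from the Lua cache on a hit; otherwise continue to the backend.
    http-request lua.cache_request

backend http-out
    mode http
    option httpchk HEAD / HTTP/1.0
    timeout server 1m
    timeout connect 5s
    server server1 127.0.0.1:8080 check

    # Wait for the response body so the Lua action can read it, then store it.
    http-response wait-for-body time 1s
    http-response lua.cache_response

In this configuration, http-request lua.cache_request looks up the response in the cache and replies immediately on a hit. No acl or http-request deny rule is needed: denying would reject the request instead of serving it from the cache. In the backend section, http-response lua.cache_response adds the response to the cache so that later requests can reuse it.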

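Request-header-based caching

Besides the URL-based strategy, a cache in HAProxy can also take request headers into account. Choosing the cache entry from information in the request headers gives finer control over what is shared between clients and when an entry is reused.

For example, the following Lua script implements a request-header-based cache: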

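-- cache.lua variant: the cache key also depends on selected request headers.
-- Assumes a recent HAProxy (2.4 or later) for txn:reply, res.body and wait-for-body.
local cache = {}      -- key -> { body, content_type, t }
local TTL = 3600      -- entry lifetime in seconds

-- Combine the request headers that should split the cache.
-- This is a plain concatenation used as a fingerprint, not a cryptographic hash.
local function get_request_hash(txn)
    local user_agent = txn.sf:req_hdr('user-agent')
    local accept_language = txn.sf:req_hdr('accept-language')
    return user_agent .. '|' .. accept_language
end

-- Key = host + path + request fingerprint.
local function cache_key(txn)
    return txn.sf:req_hdr('host') .. txn.sf:path() .. ':' .. get_request_hash(txn)
end

-- Return the entry stored under key, or nil if it is missing or expired.
local function cache_lookup(key)
    local entry = cache[key]
    if entry == nil or (os.time() - entry.t) > TTL then
        cache[key] = nil
        return nil
    end
    return entry
end

-- Store a response body and its Content-Type under key.
local function update_cache(key, body, content_type)
    cache[key] = { body = body, content_type = content_type, t = os.time() }
end

core.register_action('cache_request', {'http-req'}, function(txn)
    local key = cache_key(txn)
    -- Save the key so the response action does not have to rebuild it.
    txn:set_var('txn.cache_key', key)
    local entry = cache_lookup(key)
    if entry ~= nil then
        -- Cache hit: answer directly and stop processing the request.
        txn:done(txn:reply({
            status = 200,
            headers = { ['content-type'] = { entry.content_type } },
            body = entry.body
        }))
    end
end)

core.register_action('cache_response', {'http-res'}, function(txn)
    local key = txn:get_var('txn.cache_key')
    -- Only successful responses are stored.
    if key ~= nil and txn.f:status() == 200 then
        update_cache(key, txn.sf:res_body(), txn.sf:res_hdr('content-type'))
    end
end)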

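The Lua script above derives a fingerprint from the User-Agent and Accept-Language request headers and uses the URL together with that fingerprint as the cache key. If no entry matches the fingerprint, or the entry has expired, the request is forwarded to the backend server. Otherwise the cached response is returned.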
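
The key construction can be generalized so that the list of headers is configurable. The sketch below is plain Lua and runs outside HAProxy; build_cache_key and its parameters are hypothetical names, and the headers table is assumed to map lower-case header names to string values.

```lua
-- Build a cache key from the URL plus the values of selected request headers.
-- vary_on is an ordered list of lower-case header names.
function build_cache_key(host, path, headers, vary_on)
    local parts = { host .. path }
    for _, name in ipairs(vary_on) do
        -- A missing header becomes an empty value so the key stays well-formed.
        parts[#parts + 1] = name .. "=" .. (headers[name] or "")
    end
    return table.concat(parts, "|")
end

-- Two clients that differ only in Accept-Language get different keys.
local vary_on = { "accept-language" }
print(build_cache_key("example.com", "/news", { ["accept-language"] = "en" }, vary_on))
print(build_cache_key("example.com", "/news", { ["accept-language"] = "zh" }, vary_on))
```

Each header added to the list multiplies the number of distinct entries per URL. User-Agent in particular has so many values that varying on it tends to lower the hit rate sharply, so it is worth including only the headers that actually change the response.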

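Cache expiration and control

In addition to the URL and the request headers, the cache can also take response headers into account. Choosing the strategy from information in the response headers lets the server decide which responses are cached and for how long.

For example, the following Lua script implements a response-header-based cache: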

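-- cache.lua variant: responses carrying a given header are never cached.
-- Assumes a recent HAProxy (2.4 or later) for txn:reply, res.body and wait-for-body.
local cache = {}      -- key -> { body, content_type, t }
local TTL = 3600      -- entry lifetime in seconds

-- Plain concatenation of the request headers that split the cache.
local function get_request_hash(txn)
    return txn.sf:req_hdr('user-agent') .. '|' .. txn.sf:req_hdr('accept-language')
end

-- Key = host + path + request fingerprint.
local function cache_key(txn)
    return txn.sf:req_hdr('host') .. txn.sf:path() .. ':' .. get_request_hash(txn)
end

-- Return the entry stored under key, or nil if it is missing or expired.
local function cache_lookup(key)
    local entry = cache[key]
    if entry == nil or (os.time() - entry.t) > TTL then
        cache[key] = nil
        return nil
    end
    return entry
end

-- Store a response body and its Content-Type under key.
local function update_cache(key, body, content_type)
    cache[key] = { body = body, content_type = content_type, t = os.time() }
end

core.register_action('cache_request', {'http-req'}, function(txn)
    local key = cache_key(txn)
    -- Save the key so the response action does not have to rebuild it.
    txn:set_var('txn.cache_key', key)
    local entry = cache_lookup(key)
    if entry ~= nil then
        -- Cache hit: answer directly and stop processing the request.
        txn:done(txn:reply({
            status = 200,
            headers = { ['content-type'] = { entry.content_type } },
            body = entry.body
        }))
    end
end)

core.register_action('cache_response', {'http-res'}, function(txn)
    local key = txn:get_var('txn.cache_key')
    local headers = txn.http:res_get_headers()   -- header names are lower-case
    -- Skip responses that are not 200 or that carry strange-header.
    if key ~= nil and txn.f:status() == 200 and headers['strange-header'] == nil then
        update_cache(key, txn.sf:res_body(), txn.sf:res_hdr('content-type'))
    end
end)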

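The Lua script above checks whether the response carries a header field named strange-header. If it does, the response is not cached. Otherwise the response is stored.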
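
The store decision can be isolated in one function, which makes the rule easy to extend and to test without HAProxy. This is a sketch under stated assumptions: should_store is a hypothetical name, headers is assumed to map lower-case header names to values, and the extra rules (only status 200, never store responses that set cookies) are illustrative choices rather than anything the scripts above require.

```lua
-- Decide whether a response may be stored, given its status and headers.
-- blocked_headers is a list of lower-case header names that veto caching.
function should_store(status, headers, blocked_headers)
    if status ~= 200 then
        return false
    end
    -- Responses that set cookies are usually specific to one client.
    if headers["set-cookie"] ~= nil then
        return false
    end
    for _, name in ipairs(blocked_headers) do
        if headers[name] ~= nil then
            return false
        end
    end
    return true
end

local blocked = { "strange-header" }
print(should_store(200, { ["content-type"] = "text/html" }, blocked))  -- true
print(should_store(200, { ["strange-header"] = "1" }, blocked))        -- false
```

Keeping the veto list as data means a new marker header can be added without touching the lookup or storage code.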

Cache management and monitoring

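Besides improving performance, a cache also has to be managed and monitored. HAProxy's Runtime API provides a show cache command that reports the state of the caches declared in cache sections of the configuration (the built-in cache, available since HAProxy 1.8). It does not list entries held in a Lua table, so a Lua cache has to expose its own statistics.

For example, the following commands validate the configuration, start HAProxy as a daemon, print the build options (including whether Lua support is compiled in), and query the cache state over the stats socket. The socket must be enabled with a stats socket /run/haproxy.sock line in the global section:

$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg
$ sudo haproxy -D -f /etc/haproxy/haproxy.cfg
$ haproxy -vv
$ echo "show cache" | sudo socat stdio /run/haproxy.sock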
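
For a cache kept in a Lua table, hit counts and expired entries have to be tracked by the script itself. The following plain-Lua sketch shows one way to do that; purge_expired and summarize are hypothetical helper names, and each entry is assumed to store its creation time in a field named t.

```lua
-- Remove every entry older than ttl seconds and return how many were removed.
-- Clearing existing fields while iterating with pairs is allowed in Lua.
function purge_expired(entries, ttl, now)
    local removed = 0
    for key, entry in pairs(entries) do
        if now - entry.t > ttl then
            entries[key] = nil
            removed = removed + 1
        end
    end
    return removed
end

-- Produce a one-line status report for logging or a status endpoint.
function summarize(entries, hits, misses)
    local count = 0
    for _ in pairs(entries) do
        count = count + 1
    end
    local total = hits + misses
    local ratio = total > 0 and hits / total or 0
    return string.format("entries=%d hits=%d misses=%d hit_ratio=%.2f",
                         count, hits, misses, ratio)
end

local entries = { ["/a"] = { t = 0 }, ["/b"] = { t = 100 } }
print(purge_expired(entries, 50, 101))
print(summarize(entries, 3, 1))
```

Inside HAProxy, a report like this could be written to the log at intervals or served from a dedicated URL; how it is exposed is left open here.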

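Choosing among caching approaches

The right caching approach depends on the workload and the size of the deployment, so the options have to be weighed against the actual situation.

Common approaches include:

URL-based caching: caches responses per URL; suited to static resources and to dynamic resources that are requested often.
Request-header-based caching: suited to cases where the cache strategy depends on information sent by the client.
Response-header-based caching: suited to cases where the cache strategy depends on information returned by the server.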

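Summary

Caching in HAProxy can improve performance and stability, reduce the load on backend servers, and allow flexible cache strategies and management. Choosing an approach means weighing the workload, the size of the deployment, and the data access patterns. In practice, several strategies can be combined to get the best performance and reliability.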